PURE MATHEMATICS


This book was written to provide a basic but rigorous introduction to pure mathematics, while exposing 
students to a wide range of mathematical topics in logic, set theory, abstract algebra, number theory, 
real analysis, topology, complex analysis, and linear algebra. 


For students: There are no prerequisites for this book. The content is completely self-contained. 
Students with a bit of mathematical knowledge may have an easier time getting through some of the 
material, but no such knowledge is necessary to read this book. 


More important than mathematical knowledge is “mathematical maturity.” Although there is no single 
agreed upon definition of mathematical maturity, one reasonable way to define it is as “one’s ability 
to analyze, understand, and communicate mathematics.” A student with a higher level of mathematical 
maturity will be able to move through this book more quickly than a student with a lower level of 
mathematical maturity. 


Whether your level of mathematical maturity is low or high, if you are just starting out in pure 
mathematics, then you’re in the right place. If you read this book the “right way,” then your level of 
mathematical maturity will continually be increasing. This increased level of mathematical maturity will 
not only help you to succeed in advanced math courses, but it will improve your general problem 
solving and reasoning skills. This will make it easier to improve your performance in college, in your 
professional life, and on standardized tests such as the SAT, ACT, GRE, and GMAT. 


So, what is the “right way” to read this book? Simply reading each lesson from end to end without any 
further thought and analysis is not the best way to read the book. You will need to put in some effort 
to have the best chance of absorbing and retaining the material. When a new theorem is presented, 
don’t just jump right to the proof and read it. Think about what the theorem is saying. Try to describe 
it in your own words. Do you believe that it is true? If you do believe it, can you give a convincing 
argument that it is true? If you do not believe that it is true, try to come up with an example that shows 
it is false, and then figure out why your example does not contradict the theorem. Pick up a pen or 
pencil. Draw some pictures, come up with your own examples, and try to write your own proof. 


You may find that this book goes into more detail than other math books when explaining examples, 
discussing concepts, and proving theorems. This was done so that any student can read this book, and 
not just students who are naturally gifted in mathematics. So, it is up to you as the student to try to 
answer questions before they are answered for you. When a new definition is given, try to think of your 
own examples before looking at those presented in the book. And when the book provides an example, 
do not just accept that it satisfies the given definition. Convince yourself. Prove it. 


Each lesson is followed by a Problem Set. The problems in each Problem Set have been organized into 
five levels, Level 1 problems being considered the easiest, and Level 5 problems being considered the 
most difficult. If you want to get just a small taste of pure mathematics, then you can work on the 
easier problems. If you want to achieve a deeper understanding of the material, take some time to 
struggle with the harder problems. 


For instructors: This book can be used for a wide range of courses. Although the lessons can be taught 
in the order presented, they do not need to be. The lessons cycle twice among eight subject areas: 
logic, set theory, abstract algebra, number theory, real analysis, topology, complex analysis, and linear 
algebra. 


Lessons 1 through 8 give only the most basic material in each of these subjects. Therefore, an instructor 
who wants to give a brief glimpse into a wide variety of topics might want to cover just the first eight 
lessons in their course. 


Lessons 9 through 16 cover material in each subject area that the author believes is fundamental to a 
deep understanding of that particular subject. 


For a first course in higher mathematics, a high-quality curriculum can be created by choosing among 
the 16 lessons contained in this book. 


As an example, an introductory course focusing on logic, set theory, and real analysis might cover 
Lessons 1, 2, 5, 9, 10, and 13. Lessons 1 and 9 cover basic sentential logic and proof theory, Lessons 2 
and 10 cover basic set theory including relations, functions, and equinumerosity, and Lessons 5 and 13 
cover basic real analysis up through a rigorous treatment of limits and continuity. The first three lessons 
are quite basic, while the latter three lessons are at an intermediate level. Instructors that do not like 
the idea of leaving a topic and then coming back to it later can cover the lessons in the following order 
without issue: 1, 9, 2, 10, 5, and 13. 


As another example, a course focusing on algebraic structures might cover Lessons 2, 3, 4, 5, 10, and 
11. As mentioned in the previous paragraph, Lessons 2 and 10 cover basic set theory. In addition, 
Lessons 3, 4, 5, and 11 cover semigroups, monoids, groups, rings, and fields. Lesson 4, in addition to a 
preliminary discussion on rings, also covers divisibility and the principle of mathematical induction. 
Similarly, Lesson 5, in addition to a preliminary discussion on fields, provides a development of the 
complete ordered field of real numbers. These topics can be included or omitted, as desired. Instructors 
that would also like to incorporate vector spaces can include part or all of Lesson 8. 


The author strongly recommends covering Lesson 2 in any introductory pure math course. This lesson 
fixes some basic set theoretical notation that is used throughout the book and includes some important 
exposition to help students develop strong proof writing skills as quickly as possible. 


The author welcomes all feedback from instructors. Any suggestions will be considered for future 
editions of the book. The author would also love to hear about the various courses that are created 
using these lessons. Feel free to email Dr. Steve Warner with any feedback at 


steve@SATPrepGet800.com


LESSON 1 - LOGIC 
STATEMENTS AND TRUTH 


Statements with Words 
A statement (or proposition) is a sentence that can be true or false, but not both simultaneously. 
Example 1.1: “Mary is awake” is a statement because at any given time either Mary is awake or Mary 
is not awake (also known as Mary being asleep), and Mary cannot be both awake and asleep at the 
same time.


Example 1.2: The sentence “Wake up!” is not a statement because it cannot be true or false. 


An atomic statement expresses a single idea. The statement “Mary is awake” that we discussed above 
is an example of an atomic statement. Let’s look at a few more examples. 
Example 1.3: The following sentences are atomic statements: 

1. 17 is a prime number.

2. George Washington was the first president of the United States. 

3. 5>6. 

4. David is left-handed.


Sentences 1 and 2 above are true, and sentence 3 is false. We can’t say for certain whether sentence 4 
is true or false without knowing who David is. However, it is either true or false. It follows that each of 
the four sentences above is an atomic statement.


We use logical connectives to form compound statements. The most commonly used logical 
connectives are “and,” “or,” “if...then,” “if and only if,” and “not.” 
Example 1.4: The following sentences are compound statements: 

1. 17 is a prime number and 0 = 1.

2. Michael is holding a pen or water is a liquid. 

3. If Joanna has a cat, then fish have lungs. 

4. Albert Einstein is alive today if and only if 5 + 7 = 12. 

5. 16 is not a perfect square. 


Sentence 1 above uses the logical connective “and.” Since the statement “0 = 1” is false, it follows that 
sentence 1 is false. It does not matter that the statement “17 is a prime number” is true. In fact, “T 
and F” is always F. 


Sentence 2 uses the logical connective “or.” Since the statement “water is a liquid” is true, it follows 
that sentence 2 is true. It does not even matter whether Michael is holding a pen. In fact, “T or T” is 
always true and “F or T” is always T. 


It’s worth pausing for a moment to note that in the English language the word “or” has two possible 
meanings. There is an “inclusive or” and an “exclusive or.” The “inclusive or” is true when both 
statements are true, whereas the “exclusive or” is false when both statements are true. In 
mathematics, by default, we always use the “inclusive or” unless we are told to do otherwise. To some 
extent, this is an arbitrary choice that mathematicians have agreed upon. However, it can be argued 
that it is the better choice since it is used more often and it is easier to work with. Note that we were 
assuming use of the “inclusive or” in the last paragraph when we said, “In fact, “T or T” is always true.” 
See Problem 4 below for more on the “exclusive or.” 


Sentence 3 uses the logical connective “if...then.” The statement “fish have lungs” is false. We need to 
know whether Joanna has a cat in order to figure out the truth value of sentence 3. If Joanna does have 
a cat, then sentence 3 is false (“if T, then F” is always F). If Joanna does not have a cat, then sentence 
3 is true (“if F, then F” is always T). 


Sentence 4 uses the logical connective “if and only if.” Since the two atomic statements have different 
truth values, it follows that sentence 4 is false. In fact, “F if and only if T” is always F. 


Sentence 5 uses the logical connective “not.” Since the statement “16 is a perfect square” is true, it 
follows that sentence 5 is false. In fact, “not T” is always F. 


Notes: (1) The logical connectives “and,” “or,” “if...then,” and “if and only if,” are called binary 
connectives because they join two statements (the prefix “bi” means “two”). 


(2) The logical connective “not” is called a unary connective because it is applied to just a single 
statement (“unary” means “acting on a single element”). 
Example 1.5: The following sentences are not statements: 

1. Are you happy? 

2. Go away! 

3. x − 5 = 7

4. This sentence is false. 

5. This sentence is true. 


Sentence 1 above is a question and sentence 2 is a command. Sentence 3 has an unknown variable — it 
can be turned into a statement by assigning a value to the variable. Sentences 4 and 5 are 
self-referential (they refer to themselves). They can be neither true nor false. Sentence 4 is called the 
Liar’s paradox and sentence 5 is called a vacuous affirmation. 


Statements with Symbols 


We will use letters such as p, q, r, and s to denote atomic statements. We sometimes call these letters 
propositional variables, and we will generally assign a truth value of T (for true) or F (for false) to each 
propositional variable. Formally, we define a truth assignment of a list of propositional variables to be 
a choice of T or F for each propositional variable in the list. 
We use the symbols ∧, ∨, →, ↔, and ¬ for the most common logical connectives. The truth value of a 
compound statement is determined by the truth values of its atomic parts together with applying the 
following rules for the connectives.


p ∧ q is called the conjunction of p and q. It is pronounced “p and q.” p ∧ q is true when both 
p and q are true, and it is false otherwise.


p ∨ q is called the disjunction of p and q. It is pronounced “p or q.” p ∨ q is true when p or q 
(or both) are true, and it is false when p and q are both false.


p → q is called a conditional or implication. It is pronounced “if p, then q” or “p implies q.” 
p → q is true when p is false or q is true (or both), and it is false when p is true and q is false.


p ↔ q is called a biconditional. It is pronounced “p if and only if q.” p ↔ q is true when p and 
q have the same truth value (both true or both false), and it is false when p and q have opposite 
truth values (one true and the other false).


¬p is called the negation of p. It is pronounced “not p.” ¬p is true when p is false, and it is false 
when p is true (p and ¬p have opposite truth values).
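

The five rules above can be sketched in code. The following is a minimal illustration in Python, assuming that T and F are modeled by the built-in values True and False; the function names conj, disj, cond, bicond, and neg are hypothetical names chosen for this sketch and are not notation from this book.

```python
# Truth values are modeled with Python's bool: True for T, False for F.
# All function names here are illustrative assumptions.

def conj(p, q):
    """Conjunction: true only when both p and q are true."""
    return p and q

def disj(p, q):
    """Disjunction (inclusive or): false only when p and q are both false."""
    return p or q

def cond(p, q):
    """Conditional: false only when p is true and q is false."""
    return (not p) or q

def bicond(p, q):
    """Biconditional: true when p and q have the same truth value."""
    return p == q

def neg(p):
    """Negation: the opposite truth value of p."""
    return not p

print(cond(False, False))  # "if F, then F" evaluates to True
print(bicond(False, True))  # "F if and only if T" evaluates to False
```

Writing the conditional as (not p) or q mirrors the rule that p → q is true when p is false or q is true (or both).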


Example 1.6: Let p represent the statement “Fish can swim,” and let q represent the statement 
“7 < 3.” Note that p is true and q is false. 


1. p ∧ q represents “Fish can swim and 7 < 3.” Since q is false, it follows that p ∧ q is false.

2. p ∨ q represents “Fish can swim or 7 < 3.” Since p is true, it follows that p ∨ q is true.

3. p → q represents “If fish can swim, then 7 < 3.” Since p is true and q is false, p → q is false.

4. p ↔ q represents “Fish can swim if and only if 7 < 3.” Since p is true and q is false, p ↔ q is 
false.

5. ¬q represents the statement “7 is not less than 3.” This is equivalent to “7 is greater than or 
equal to 3,” or equivalently, “7 ≥ 3.” Since q is false, ¬q is true.

6. ¬p ∨ q represents the statement “Fish cannot swim or 7 < 3.” Since ¬p and q are both false, 
¬p ∨ q is false. Note that ¬p ∨ q always means (¬p) ∨ q. In general, without parentheses 
present, we always apply negation before any of the other connectives.

7. ¬(p ∨ q) represents the statement “It is not the case that either fish can swim or 7 < 3.” This 
can also be stated as “Neither can fish swim nor is 7 less than 3.” Since p ∨ q is true (see 2 
above), ¬(p ∨ q) is false.

8. ¬p ∧ ¬q represents the statement “Fish cannot swim and 7 is not less than 3.” This statement 
can also be stated as “Neither can fish swim nor is 7 less than 3.” Since this is the same 
statement as in 7 above, it should follow that ¬p ∧ ¬q is equivalent to ¬(p ∨ q). After 
completing this lesson, you will be able to verify this. For now, let’s observe that since ¬p is 
false, it follows that ¬p ∧ ¬q is false. This agrees with the truth value we got in 7. (Note: The 
equivalence of ¬p ∧ ¬q with ¬(p ∨ q) is one of De Morgan’s laws. These laws will be explored 
further in Lesson 9. See also Problem 3 below.)
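

The eight computations in Example 1.6 can be double-checked mechanically. The sketch below assumes Python's True and False stand in for T and F; the dictionary name values and the variable de_morgan_holds are illustrative assumptions. It also checks, over all four truth assignments, the agreement between the seventh and eighth statements.

```python
p = True    # "Fish can swim"
q = False   # "7 < 3"

# Keys are the item numbers from Example 1.6.
values = {
    1: p and q,                 # conjunction of p and q
    2: p or q,                  # disjunction of p and q
    3: (not p) or q,            # conditional "if p, then q"
    4: p == q,                  # biconditional
    5: not q,                   # negation of q
    6: (not p) or q,            # (not p) or q: negation is applied first
    7: not (p or q),            # negation of the whole disjunction
    8: (not p) and (not q),     # conjunction of the two negations
}

for number, value in values.items():
    print(number, "T" if value else "F")

# Items 7 and 8 agree for every truth assignment of the two variables.
de_morgan_holds = all(
    (not (a or b)) == ((not a) and (not b))
    for a in (True, False)
    for b in (True, False)
)
print(de_morgan_holds)
```

Items 3 and 6 reduce to the same Python expression because the conditional was encoded here as (not p) or q; that encoding is a choice made for this sketch.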
Truth Tables 


A truth table can be used to display the possible truth values of a compound statement. We start by 
labelling the columns of the table with the propositional variables that appear in the statement, 
followed by the statement itself. We then use the rows to run through every possible combination of 
truth values for the propositional variables followed by the resulting truth values for the compound 
statement. Let’s look at the truth tables for the five most common logical connectives. 


p | q | p ∧ q     p | q | p ∨ q     p | q | p → q
T | T |   T       T | T |   T       T | T |   T
T | F |   F       T | F |   T       T | F |   F
F | T |   F       F | T |   T       F | T |   T
F | F |   F       F | F |   F       F | F |   T

p | q | p ↔ q     p | ¬p
T | T |   T       T | F
T | F |   F       F | T
F | T |   F
F | F |   T


We can use these five truth tables to compute the truth values of compound statements involving the 
five basic logical connectives. 
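

As a rough check on the tables above, the four binary truth tables can be generated by running through every combination of truth values. This is a sketch only: the names CONNECTIVES, tf, and truth_table are hypothetical, and True and False are assumed to stand in for T and F.

```python
from itertools import product

# Each binary connective is modeled as a function of two truth values.
CONNECTIVES = {
    "p ∧ q": lambda p, q: p and q,
    "p ∨ q": lambda p, q: p or q,
    "p → q": lambda p, q: (not p) or q,
    "p ↔ q": lambda p, q: p == q,
}

def tf(value):
    """Render a truth value as the letter T or F."""
    return "T" if value else "F"

def truth_table(func):
    """List the rows (p, q, result) in the order TT, TF, FT, FF."""
    return [(p, q, func(p, q)) for p, q in product([True, False], repeat=2)]

for name, func in CONNECTIVES.items():
    print(f"p | q | {name}")
    for p, q, result in truth_table(func):
        print(f"{tf(p)} | {tf(q)} | {tf(result)}")
    print()
```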


Note: For statements involving just 1 propositional variable (such as ¬p), the truth table requires 2 
rows, 1 for each truth assignment of p (T or F).


For statements involving 2 propositional variables (such as p ∧ q), the truth table requires 2 · 2 = 4 (or 
2² = 4) rows, as there are 4 possible combinations for truth assignments of p and q (TT, TF, FT, FF).


In general, for a statement involving n propositional variables, the truth table will require 2ⁿ rows. For 
example, if we want to build an entire truth table for ¬p ∨ (¬q → r), we will need 2³ = 2 · 2 · 2 = 8 
rows in the truth table. We will create the truth table for this statement in Example 1.8 below (see the 
third solution).
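

The row count can be confirmed by listing every truth assignment for small values of n. In this sketch, truth_assignments is a hypothetical helper name, and True and False are assumed to stand in for T and F.

```python
from itertools import product

def truth_assignments(n):
    """Return every truth assignment of n propositional variables."""
    return list(product([True, False], repeat=n))

# The number of assignments doubles with each additional variable.
for n in range(1, 5):
    print(n, len(truth_assignments(n)), 2 ** n)
```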


Example 1.7: If p is true and q is false, then we can compute the truth value of p ∧ q by looking at the 
second row of the truth table for the conjunction.


p | q | p ∧ q
T | T |   T
T | F |   F     ← highlighted row
F | T |   F
F | F |   F


We see from the highlighted row that p ∧ q ≡ T ∧ F ≡ F.


Note: Here the symbol ≡ can be read “is logically equivalent to.” So, we see that if p is true and q is 
false, then p ∧ q is logically equivalent to F, or more simply, p ∧ q is false.


Example 1.8: Let p, q, and r be propositional variables with p and q true, and r false. Let’s compute 
the truth value of ¬p ∨ (¬q → r).


Solution: We have ¬p ∨ (¬q → r) ≡ ¬T ∨ (¬T → F) ≡ F ∨ (F → F) ≡ F ∨ T ≡ T.


Notes: (1) For the first equivalence, we simply replaced the propositional variables by their given truth 
values. We replaced p and q by T, and we replaced r by F.


(2) For the second equivalence, we used the first row of the truth table for the 
negation (shown below for your convenience).


p | ¬p
T | F     ← highlighted row
F | T


We see from the highlighted row that ¬T ≡ F. We applied this result twice.


(3) For the third equivalence, we used the fourth row of the truth table for the conditional.


p | q | p → q
T | T |   T
T | F |   F
F | T |   T
F | F |   T     ← highlighted row


We see from the highlighted row that F → F ≡ T.


(4) For the last equivalence, we used the third row of the truth table for the disjunction.


p | q | p ∨ q
T | T |   T
T | F |   T
F | T |   T     ← highlighted row
F | F |   F


We see from the highlighted row that F ∨ T ≡ T.


(5) We can save a little time by immediately replacing the negation of a propositional variable by its 
truth value (which will be the opposite truth value of the propositional variable). For example, since p 
has truth value T, we can replace ¬p by F. The faster solution would look like this:


¬p ∨ (¬q → r) ≡ F ∨ (F → F) ≡ F ∨ T ≡ T.


Quicker solution: Since q has truth value T, it follows that ¬q has truth value F. So, ¬q → r has truth 
value T. Finally, ¬p ∨ (¬q → r) must then have truth value T.


Notes: (1) Symbolically, we can write the following:


¬p ∨ (¬q → r) ≡ ¬p ∨ (¬T → r) ≡ ¬p ∨ (F → r) ≡ ¬p ∨ T ≡ T

(2) We can display this reasoning visually as follows:


The vertical lines have just been included to make sure you see which connective each truth value is 
written below. 


We began by placing a T under the propositional variable q to indicate that q is true. Since ¬T ≡ F, we 
then place an F under the negation symbol. Next, since F → r ≡ T regardless of the truth value of r, 
we place a T under the conditional symbol. Finally, since ¬p ∨ T ≡ T regardless of the truth value of 
p, we place a T under the disjunction symbol. We made this last T bold to indicate that we are finished.


(3) Knowing that q has truth value T is enough to determine the truth value of ¬p ∨ (¬q → r), as we 
saw in Note 1 above. It’s okay if you didn’t notice that right away. This kind of reasoning takes a bit of 
practice and experience. 
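

The observation that the truth value of q alone settles the answer can be tested by brute force. The sketch below assumes True and False stand in for T and F; the helper names cond and statement and the variable names are illustrative assumptions.

```python
from itertools import product

def cond(a, b):
    """Conditional: false only when a is true and b is false."""
    return (not a) or b

def statement(p, q, r):
    """The compound statement from Example 1.8: (not p) or ((not q) -> r)."""
    return (not p) or cond(not q, r)

# The given truth assignment: p and q true, r false.
value = statement(True, True, False)
print(value)

# With q fixed at true, try every truth value of p and r.
q_alone_suffices = all(
    statement(p, True, r) for p, r in product([True, False], repeat=2)
)
print(q_alone_suffices)
```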


Truth table solution: An alternative solution is to build the whole truth table of ¬p ∨ (¬q → r) one 
column at a time. Since there are 3 propositional variables (p, q, and r), we will need 2³ = 8 rows to 
get all the possible truth values. We then create a column for each compound statement that appears 
within the given statement starting with the statements of smallest length and working our way up to 
the given statement. We will need columns for p, q, r (the atomic statements), ¬p, ¬q, ¬q → r, and 
finally, the statement itself, ¬p ∨ (¬q → r). Below is the final truth table with the relevant row 
highlighted; the last entry in that row is the final answer.


p | q | r | ¬p | ¬q | ¬q → r | ¬p ∨ (¬q → r)
T | T | T | F  | F  |   T    |       T
T | T | F | F  | F  |   T    |       T          ← highlighted row
T | F | T | F  | T  |   T    |       T
T | F | F | F  | T  |   F    |       F
F | T | T | T  | F  |   T    |       T
F | T | F | T  | F  |   T    |       T
F | F | T | T  | T  |   T    |       T
F | F | F | T  | T  |   F    |       T


Notes: (1) We fill out the first three columns of the truth table by listing all possible combinations of 
truth assignments for the propositional variables p, q, and r. Notice how down the first column we 
have 4 T’s followed by 4 F’s, down the second column we alternate sequences of 2 T’s with 2 F’s, and 
down the third column we alternate T’s with F’s one at a time. This is a nice systematic way to make 
sure we get all possible combinations of truth assignments. 
If you’re having trouble seeing the pattern of T’s and F’s, here is another way to think about it: In the 
first column, the first half of the rows have a T and the remainder have an F. This gives 4 T’s followed 
by 4 F’s. 


For the second column, we take half the number of consecutive T’s in the first column (half of 4 is 2) 
and then we alternate between 2 T’s and 2 F’s until we fill out the column. 


For the third column, we take half the number of consecutive T’s in the second column (half of 2 is 1) 
and then we alternate between 1 T and 1 F until we fill out the column. 


(2) Since the connective ¬ has the effect of taking the opposite truth value, we generate the entries in 
the fourth column by taking the opposite of each truth value in the first column. Similarly, we generate 
the entries in the fifth column by taking the opposite of each truth value in the second column.


(3) For the sixth column, we apply the connective → to the fifth and third columns, respectively, and 
finally, for the last column, we apply the connective ∨ to the fourth and sixth columns, respectively.


(4) The original question is asking us to compute the truth value of ¬p ∨ (¬q → r) when p and q are 
true, and r is false. In terms of the truth table, we are being asked for the entry in the second row and 
last (seventh) column. Therefore, the answer is T.


(5) This is certainly not the most efficient way to answer the given question. However, building truth 
tables is not too difficult, and it’s a foolproof way to determine truth values of compound statements. 
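

The column-by-column construction described in the notes above can be sketched as follows. This is one possible way to organize the computation, assuming True and False stand in for T and F; the names cond, tf, and rows are illustrative assumptions.

```python
from itertools import product

def cond(a, b):
    """Conditional: false only when a is true and b is false."""
    return (not a) or b

def tf(value):
    """Render a truth value as the letter T or F."""
    return "T" if value else "F"

rows = []
# product lists TTT, TTF, TFT, TFF, FTT, FTF, FFT, FFF, in that order.
for p, q, r in product([True, False], repeat=3):
    not_p = not p              # fourth column: opposite of the first
    not_q = not q              # fifth column: opposite of the second
    inner = cond(not_q, r)     # sixth column: fifth column -> third column
    whole = not_p or inner     # seventh column: fourth column or sixth column
    rows.append((p, q, r, not_p, not_q, inner, whole))

for row in rows:
    print(" | ".join(tf(entry) for entry in row))

# The second row is the assignment p true, q true, r false.
print(tf(rows[1][-1]))
```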
Problem Set 1 


Full solutions to these problems are available for free download here: 


www.SATPrepGet800.com/PMFBXSG 
LEVEL 1 


1. Determine whether each of the following sentences is an atomic statement, a compound 
statement, or not a statement at all:

(i) I am not going to work today.
(ii) What is the meaning of life?
(iii) Don’t go away mad.
(iv) I watched the television show Parks and Recreation.
(v) If pigs have wings, then they can fly.
(vi) 3 < −5 or 38 > 37.
(vii) This sentence has five words.
(viii) I cannot swim, but I can run fast.


2. What is the negation of each of the following statements:

(i) The banana is my favorite fruit.
(ii) 7 > −3.
(iii) You are not alone.
(iv) The function f is differentiable everywhere.


LEVEL 2 


3. Let p represent the statement “9 is a perfect square,” let q represent the statement “Orange is a 
primary color,” and let r represent the statement “A frog is a reptile.” Rewrite each of the 
following symbolic statements in words, and state the truth value of each statement:

(i) p ∧ q
(ii) ¬r
(iii) p → r
(iv) q ↔ r
(v) ¬p ∧ q
(vi) ¬(p ∧ q)
(vii) ¬p ∨ ¬q
(viii) (p ∧ q) → r


4. Consider the compound sentence “You can have a cookie or ice cream.” In English this would 
most likely mean that you can have one or the other but not both. The word “or” used here is 
generally called an “exclusive or” because it excludes the possibility of both. The disjunction is 
an “inclusive or.” Using the symbol ⊕ for exclusive or, draw the truth table for this connective.


LEVEL 3 


5. Let p, q, and r represent true statements. Compute the truth value of each of the following 
compound statements:

(i) (p ∨ q) ∨ r
(ii) (p ∨ q) ∧ ¬r
(iii) ¬p → (q ∨ r)
(iv) ¬(p ↔ ¬q) ∧ r
(v) ¬[p ∧ (¬q → r)]
(vi) ¬[(¬p ∨ ¬q) ↔ ¬r]
(vii) p → (q → ¬r)
(viii) ¬[¬p → (q → ¬r)]


6. Using only the logical connectives ¬, ∧, and ∨, produce a statement using the propositional 
variables p and q that has the same truth values as p ⊕ q (this is the “exclusive or” defined in 
problem 4 above).


LEVEL 4 


7. Let p represent a true statement. Decide if this is enough information to determine the truth value 
of each of the following statements. If so, state that truth value.

(i) p ∨ q
(ii) p → q
(iii) ¬p → ¬(q ∨ ¬r)
(iv) ¬(¬p ∧ q) ↔ p
(v) (p ↔ q) ↔ ¬p
(vi) ¬[(¬p ∧ ¬q) ↔ ¬r]
(vii) [(p ∧ ¬p) → p] ∧ (p ∨ ¬p)
(viii) r → [¬q → (¬p → ¬r)]


8. Assume that the given compound statement is true. Determine the truth value of each 
propositional variable.

(i) p ∧ q
(ii) ¬(p → q)
(iii) p ↔ [¬(p ∧ q)]
(iv) [p ∧ (q ∨ r)] ∧ ¬r


LEVEL 5 


9. Show that [p ∧ (q ∨ r)] ↔ [(p ∧ q) ∨ (p ∧ r)] is always true.


10. Show that [[(p ∧ q) → r] → s] → [(p → r) → s] is always true.
LESSON 2 - SET THEORY 
SETS AND SUBSETS 


Describing Sets 


A set is simply a collection of “objects.” These objects can be numbers, letters, colors, animals, funny 
quotes, or just about anything else you can imagine. We will usually refer to the objects in a set as the 
members or elements of the set. 


If a set consists of a small number of elements, we can describe the set simply by listing the elements 
in the set in curly braces, separating elements by commas. 


Example 2.1: 
1. {apple, banana} is the set consisting of two elements: apple and banana. 


2. {anteater, elephant, egg, trapezoid} is the set consisting of four elements: anteater, elephant, 
egg, and trapezoid. 


3. {2, 4, 6, 8, 10} is the set consisting of five elements: 2, 4, 6, 8, and 10. The elements in this set 
happen to be numbers. 


A set is determined by its elements, and not the order in which the elements are presented. For 
example, the set {4, 2, 8, 6, 10} is the same as the set {2, 4, 6, 8, 10}. 


Also, the set {2, 2, 4, 6, 8, 10, 10, 10} is the same as the set {2, 4, 6, 8, 10}. If we are describing a set by 
listing its elements, the most natural way to do this is to list each element just once. 


We will usually name sets using capital letters such as A, B, and C. For example, we might write 
A = {1, 2, 3}. So, A is the set consisting of the elements 1, 2, and 3.


Example 2.2: Consider the sets A = {a, b}, B = {b, a}, C = {a, b, a}. Then A, B, and C all represent the 
same set. We can write A = B = C.
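

Python's built-in set type happens to follow the same two conventions, so it can be used to experiment with them. The sketch below models the elements a and b as the strings "a" and "b", which is an assumption made only for illustration.

```python
# Order and repetition do not matter when comparing sets.
A = {"a", "b"}
B = {"b", "a"}
C = {"a", "b", "a"}

print(A == B == C)

# The same holds for the numeric examples in the text.
print({4, 2, 8, 6, 10} == {2, 4, 6, 8, 10})
print({2, 2, 4, 6, 8, 10, 10, 10} == {2, 4, 6, 8, 10})
```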


We use the symbol ∈ for the membership relation (we will define the term “relation” more carefully in 
Lesson 10). So, x ∈ A means “x is an element of A,” whereas x ∉ A means “x is not an element of A.”


Example 2.3: Let A = {a, k, 3, □, ⊕}. Then a ∈ A, k ∈ A, 3 ∈ A, □ ∈ A, and ⊕ ∈ A.


If a set consists of many elements, we can use ellipses (...) to help describe the set. For example, the 
set consisting of the natural numbers between 17 and 5326, inclusive, can be written 
{17, 18, 19, ..., 5325, 5326} (“inclusive” means that we include 17 and 5326). The ellipses between 19 
and 5325 are there to indicate that there are elements in the set that we are not explicitly mentioning.


Ellipses can also be used to help describe infinite sets. The set of natural numbers can be written 
ℕ = {0, 1, 2, 3, ...}, and the set of integers can be written ℤ = {..., −4, −3, −2, −1, 0, 1, 2, 3, 4, ...}.


Example 2.4: The odd natural numbers can be written 𝕆 = {1, 3, 5, ...}. The even integers can be 
written 2ℤ = {..., −6, −4, −2, 0, 2, 4, 6, ...}. The primes can be written ℙ = {2, 3, 5, 7, 11, 13, 17, ...}.


A set can also be described by a certain property P that all its elements have in common. In this case, 
we can use the set-builder notation {x|P(x)} to describe the set. The expression {x|P(x)} can be read 
“the set of all x such that the property P(x) is true.” Note that the symbol “|” is read as “such that.” 


Example 2.5: Let’s look at a few different ways that we can describe the set {2, 4, 6, 8, 10}. We have 
already seen that reordering and/or repeating elements does not change the set. For example, 
{2, 2, 6, 4, 10, 8} describes the same set. Here are a few more descriptions using set-builder notation:


• {n | n is an even positive integer less than or equal to 10}
• {n ∈ ℤ | n is even, 0 < n ≤ 10}
• {2k | k = 1, 2, 3, 4, 5}


The first expression in the bulleted list can be read “the set of n such that n is an even positive integer 
less than or equal to 10.” The second expression can be read “the set of integers n such that n is even 
and n is between 0 and 10, including 10, but excluding 0.” Note that the abbreviation “n ∈ ℤ” can be 
read “n is in the set of integers,” or more succinctly, “n is an integer.” The third expression can be read 
“the set of 2k such that k is 1, 2, 3, 4, or 5.”
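

Set comprehensions in Python read much like set-builder notation, so the three descriptions can be compared directly. This is a sketch under stated assumptions: because a program cannot range over all integers, the finite range from −100 to 100 is used as a stand-in for the set of integers, and the names first, second, and third are illustrative.

```python
# First description: even positive integers less than or equal to 10.
first = {n for n in range(1, 11) if n % 2 == 0}

# Second description: integers n with n even and 0 < n <= 10.
# range(-100, 101) is a finite stand-in for the integers (an assumption).
second = {n for n in range(-100, 101) if n % 2 == 0 and 0 < n <= 10}

# Third description: all numbers 2k with k = 1, 2, 3, 4, 5.
third = {2 * k for k in (1, 2, 3, 4, 5)}

print(first == second == third == {2, 4, 6, 8, 10})
```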


If A is a finite set, we define the cardinality of A, written |A|, to be the number of elements of A. For 
example, |{a, b}| = 2. In Lesson 10, we will extend the notion of cardinality to also include infinite sets. 


Example 2.6: Let A = {anteater, egg, trapezoid}, B = {2, 3, 3}, and C = {17, 18, 19, ..., 5325, 5326}. 
Then |A| = 3, |B| = 2, and |C| = 5310.


Notes: (1) The set A has the three elements “anteater,” “egg,” and “trapezoid.”

(2) The set B has just two elements: 2 and 3. Remember that {2, 3, 3} = {2, 3}.


(3) The number of consecutive integers from m to n, inclusive, is n − m + 1. For set C, we have 
m = 17 and n = 5326. Therefore, |C| = 5326 − 17 + 1 = 5310.


(4) I call the formula “n − m + 1” the fence-post formula. If you construct a 3-foot fence by placing a 
fence-post every foot, then the fence will consist of 4 fence-posts (3 − 0 + 1 = 4).
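

The cardinalities in Example 2.6 and the fence-post formula can be checked with a short computation. In this sketch the elements of A are modeled as strings, and count_inclusive is a hypothetical helper name for the formula n − m + 1.

```python
A = {"anteater", "egg", "trapezoid"}
B = {2, 3, 3}                      # the repeated 3 is counted only once
C = set(range(17, 5327))           # 17, 18, ..., 5326 (range excludes 5327)

print(len(A), len(B), len(C))

def count_inclusive(m, n):
    """Number of consecutive integers from m to n, inclusive."""
    return n - m + 1

print(count_inclusive(17, 5326))   # size of C by the fence-post formula
print(count_inclusive(0, 3))       # fence-posts on a 3-foot fence
```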


The empty set is the unique set with no elements. We use the symbol ∅ to denote the empty set (some 
authors use the symbol { } instead).


Subsets 


For two sets A and B, we say that A is a subset of B, written A ⊆ B, if every element of A is an element 
of B. That is, A ⊆ B if, for every x, x ∈ A implies x ∈ B. Symbolically, we can write ∀x(x ∈ A → x ∈ B).


Notes: (1) The symbol ∀ is called a universal quantifier, and it is pronounced “For all.”


(2) The logical expression ∀x(x ∈ A → x ∈ B) can be translated into English as “For all x, if x is an 
element of A, then x is an element of B.”


(3) To show that a set A is a subset of a set B, we need to show that the expression ∀x(x ∈ A → x ∈ B) 
is true. If the set A is finite and the elements are listed, we can just check that each element of A is also 
an element of B. However, if the set A is described by a property, say A = {x|P(x)}, we may need to 
craft an argument more carefully. We can begin by taking an arbitrary but specific element a from A 
and then arguing that this element a is in B.


What could we possibly mean by an arbitrary but specific element? Aren’t the words “arbitrary” and 
“specific” antonyms? Well, by arbitrary, we mean that we don’t know which element we are choosing; 
it’s just some element a that satisfies the property P. So, we are just assuming that P(a) is true. 
However, once we choose this element a, we use this same a for the rest of the argument, and that is 
what we mean by it being specific. 


(4) To the right we see a physical representation of A ⊆ B. This 
figure is called a Venn diagram. These types of diagrams are very 
useful to help visualize relationships among sets. Notice that set A 
lies completely inside set B. We assume that all the elements of A 
and B lie in some universal set U.


As an example, let’s let U be the set of all species of animals. If we 
let A be the set of species of cats and we let B be the set of species 
of mammals, then we have A ⊆ B ⊆ U, and we see that the Venn 
diagram to the right gives a visual representation of this situation. 
(Note that every cat is a mammal and every mammal is an animal.)
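

For finite sets whose elements are listed, the element-by-element check described in Note 3 can be carried out directly. The sketch below uses a hypothetical helper is_subset, and the three small sets of species names are made-up sample data chosen only to mirror the cat, mammal, and animal example.

```python
def is_subset(A, B):
    """Check that every element of A is also an element of B."""
    return all(x in B for x in A)

# Small illustrative samples, not complete lists of species.
cats = {"lion", "tiger"}
mammals = {"lion", "tiger", "whale"}
animals = {"lion", "tiger", "whale", "frog"}

print(is_subset(cats, mammals))     # cats is a subset of mammals
print(is_subset(mammals, animals))  # mammals is a subset of animals
print(is_subset(animals, cats))     # the reverse inclusion fails

# Python's built-in comparison for sets gives the same answers.
print(cats <= mammals, mammals <= animals, animals <= cats)
```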


Let’s try to prove our first theorem using the definition of a subset together with Note 3 above about 
arbitrary but specific elements. 


Theorem 2.1: Every set A is a subset of itself. 


Before writing the proof, let’s think about our strategy. We want to prove A C A. In other words, we 
want to show Vx(x E A > x E A). So, we will take an arbitrary but specific a E A and then argue that 
a € A. But that’s pretty obvious, isn’t it? In this case, the property describing the set is precisely the 
conclusion we are looking for. Here are the details. 


Proof of Theorem 2.1: Let A be a set and let a ∈ A. Then a ∈ A. So, a ∈ A → a ∈ A is true. Since a was 
an arbitrary element of A, ∀x(x ∈ A → x ∈ A) is true. Therefore, A ⊆ A. □ 


Notes: (1) The proof begins with the opening statement “Let A be a set and let a ∈ A.” In general, the 
opening statement states what is given in the problem and/or fixes any arbitrary but specific objects 
that we will need. 


(2) The proof ends with the closing statement “Therefore, A ⊆ A.” In general, the closing statement 
states the result. 


(3) Everything between the opening statement and the closing statement is known as the argument. 


(4) We place the symbol □ at the end of the proof to indicate that the proof is complete. 


(5) Consider the logical statement p → p. This statement is always true (T → T ≡ T and F → F ≡ T). 
p → p is an example of a tautology. A tautology is a statement that is true for every possible truth 
assignment of the propositional variables (see Problems 9 and 10 from Lesson 1 for more examples). 


(6) If we let p represent the statement a ∈ A, by Note 5, we see that a ∈ A → a ∈ A is always true. 


Alternate proof of Theorem 2.1: Let A be a set and let a ∈ A. Since p → p is a tautology, we have that 
a ∈ A → a ∈ A is true. Since a was arbitrary, ∀x(x ∈ A → x ∈ A) is true. Therefore, A ⊆ A. □ 


Let’s prove another basic but important theorem. 
Theorem 2.2: The empty set is a subset of every set. 


Analysis: This time we want to prove Ø ⊆ A. In other words, we want to show ∀x(x ∈ Ø → x ∈ A). 
Since x ∈ Ø is always false (the empty set has no elements), x ∈ Ø → x ∈ A is always true. 


In general, if p is a false statement, then we say that p → q is vacuously true. 


Proof of Theorem 2.2: Let A be a set. The statement x ∈ Ø → x ∈ A is vacuously true for any x, and 
so, ∀x(x ∈ Ø → x ∈ A) is true. Therefore, Ø ⊆ A. □ 


Note: The opening statement is “Let A be a set,” the closing statement is “Therefore, Ø ⊆ A,” and the 
argument is everything in between. 
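

Theorems 2.1 and 2.2 can be spot-checked on small finite sets. The following is a minimal sketch, not part of the original text: the helper name is_subset and the sample sets are assumptions chosen for illustration. The helper tests ∀x(x ∈ A → x ∈ B) by looping over the elements of A, so when A is empty there is nothing to check and the result is True, which mirrors vacuous truth.

```python
def is_subset(a, b):
    """Return True when every element of a is also an element of b."""
    # all() over an empty iterable is True, mirroring vacuous truth.
    return all(x in b for x in a)


empty = set()
samples = [set(), {1}, {"a", "b"}, {0, 1, 2}]

# The empty set is a subset of every sample set (Theorem 2.2).
empty_results = [is_subset(empty, s) for s in samples]

# Every sample set is a subset of itself (Theorem 2.1).
self_results = [is_subset(s, s) for s in samples]

print(empty_results)
print(self_results)
```

A check like this only samples a few sets; the proofs above are what establish the results for every set.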


Example 2.7: Let C = {a, b, c}, D = {a, c}, E = {b, c}, F = {b, d}, and G = Ø. Then D ⊆ C and E ⊆ C. 
Also, since the empty set is a subset of every set, we have G ⊆ C, G ⊆ D, G ⊆ E, G ⊆ F, and G ⊆ G. 
Every set is a subset of itself, and so, C ⊆ C, D ⊆ D, E ⊆ E, and F ⊆ F. 


Note: Below are possible Venn diagrams for this problem. The diagram on the left shows the 
relationship between the sets C, D, E, and F. Notice how D and E are both subsets of C, whereas F is 
not a subset of C. Also, notice how D and E overlap, E and F overlap, but there is no overlap between 
D and F (they have no elements in common). The diagram on the right shows the proper placement of 
the elements. Here, I chose the universal set to be U = {a, b,c, d,e, f, g}. This choice for the universal 
set is somewhat arbitrary. Any set containing {a, b, c, d} would do. 


Example 2.8: The set A = {a, b} has 2 elements and 4 subsets. The subsets of A are Ø, {a}, {b}, and 
{a, b}. 


The set B = {a, b, c} has 3 elements and 8 subsets. The subsets of B are Ø, {a}, {b}, {c}, {a, b}, {a, c}, 
{b,c}, and {a,b,c}. 


Let’s draw a tree diagram for the subsets of each of the sets A and B. 


[Figure: tree diagrams for the subsets of A = {a, b} (left) and B = {a, b, c} (right).] 


The tree diagram on the left is for the subsets of the set A = {a, b}. We start by writing the set 
A = {a, b} at the top. On the next line we write the subsets of cardinality 1 ({a} and {b}). On the line 
below that we write the subsets of cardinality 0 (just Ø). We draw a line segment between any two sets 
when the smaller (lower) set is a subset of the larger (higher) set. So, we see that Ø ⊆ {a}, Ø ⊆ {b}, 
{a} ⊆ {a, b}, and {b} ⊆ {a, b}. There is actually one more subset relationship, namely Ø ⊆ {a, b} (and 
of course each set displayed is a subset of itself). We didn’t draw a line segment from Ø to {a, b} to 
avoid unnecessary clutter. Instead, we can simply trace the path from Ø to {a} to {a, b} (or from Ø to 
{b} to {a, b}). We are using a property called transitivity here (see Theorem 2.3 below). 


The tree diagram on the right is for the subsets of B = {a, b, c}. Observe that from top to bottom we 
write the subsets of B of size 3, then 2, then 1, and then 0. We then draw the appropriate line 
segments, just as we did for A = {a, b}. 


How many subsets does a set of cardinality n have? Let’s start by looking at some examples. 


Example 2.9: A set with 0 elements must be Ø, and this set has exactly 1 subset (the only subset of the 
empty set is the empty set itself). 


A set with 1 element has 2 subsets, namely Ø and the set itself. 


In the last example, we saw that a set with 2 elements has 4 subsets, and we also saw that a set with 
3 elements has 8 subsets. 


Do you see the pattern yet? 1 = 2⁰, 2 = 2¹, 4 = 2², 8 = 2³. So, we see that a set with 0 elements has 
2⁰ subsets, a set with 1 element has 2¹ subsets, a set with 2 elements has 2² subsets, and a set with 3 
elements has 2³ subsets. A reasonable guess would be that a set with n elements has 2ⁿ subsets. You 
will be asked to prove this result later (Problem 12 in Lesson 4). We can also say that if |A| = n, then 
|P(A)| = 2ⁿ, where P(A) (pronounced the power set of A) is the set of all subsets of A. In set-builder 
notation, we write P(A) = {B | B ⊆ A}. 
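

The pattern can be checked by listing subsets directly. The sketch below is illustrative only: the function name power_set is an assumption, and it relies on itertools.combinations from the Python standard library to collect the subsets of each cardinality.

```python
from itertools import combinations


def power_set(elements):
    """Return a list of all subsets of a finite collection of distinct items."""
    items = list(elements)
    subsets = []
    # Collect the subsets of each cardinality k = 0, 1, ..., n.
    for k in range(len(items) + 1):
        subsets.extend(frozenset(c) for c in combinations(items, k))
    return subsets


# Compare the number of subsets with 2**n for small n.
for n in range(5):
    print(n, len(power_set(range(n))), 2 ** n)
```

Agreement for a few small values of n supports the guess but does not prove it.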


Let’s get back to the transitivity mentioned above in our discussion of tree diagrams. 


Theorem 2.3: Let A, B, and C be sets such that A ⊆ B and B ⊆ C. Then A ⊆ C. 


Proof: Suppose that A, B, and C are sets with A ⊆ B and B ⊆ C, and let a ∈ A. Since A ⊆ B and a ∈ A, 
it follows that a ∈ B. Since B ⊆ C and a ∈ B, it follows that a ∈ C. Since a was an arbitrary element 
of A, we have shown that every element of A is an element of C. That is, ∀x(x ∈ A → x ∈ C) is true. 
Therefore, A ⊆ C. □ 


Note: To the right we have a Venn diagram illustrating Theorem 2.3. 


Theorem 2.3 tells us that the relation ⊆ is transitive. Since ⊆ is 
transitive, we can write things like A ⊆ B ⊆ C ⊆ D, and without 
explicitly saying it, we know that A ⊆ C, A ⊆ D, and B ⊆ D. 


Example 2.10: The membership relation ∈ is an example of a relation that is not transitive. For example, 
let A = {0}, B = {0, 1, {0}}, and C = {x, y, {0, 1, {0}}}. Observe that A ∈ B and B ∈ C, but A ∉ C. 


[Figure: the sets A, B, and C written out, with each occurrence of A circled and each occurrence of B boxed.] 


Notes: (1) The set A has just 1 element, namely 0. 


(2) The set B has 3 elements, namely 0, 1, and {0}. But wait! A = {0}. So, A ∈ B. The set A is circled 
twice in the above image. 


(3) The set C also has 3 elements, namely, x, y, and {0, 1, {0}}. But wait! B = {0, 1, {0}}. So, B ∈ C. The 
set B has a rectangle around it twice in the above image. 


(4) Since A ≠ x, A ≠ y, and A ≠ {0, 1, {0}}, we see that A ∉ C. 


(5) Is it clear that {0} ∉ C? {0} is in a set that’s in C (namely, B), but {0} is not itself in C. 


(6) Here is a more basic example showing that ∈ is not transitive: Ø ∈ {Ø} ∈ {{Ø}}, but Ø ∉ {{Ø}}. 
The only element of {{Ø}} is {Ø}. 
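

Example 2.10 can be replayed with nested sets in code. This is a small sketch under stated assumptions: Python's frozenset stands in for a mathematical set (ordinary Python sets cannot be elements of other sets), and the strings "x" and "y" stand in for the elements x and y.

```python
# The sets from Example 2.10.
A = frozenset({0})
B = frozenset({0, 1, A})
C = frozenset({"x", "y", B})

a_in_b = A in B  # A is one of the three elements of B
b_in_c = B in C  # B is one of the three elements of C
a_in_c = A in C  # A is none of the elements x, y, B of C

# The more basic example from Note 6.
empty = frozenset()
single = frozenset({empty})    # {Ø}
nested = frozenset({single})   # {{Ø}}

print(a_in_b, b_in_c, a_in_c)
print(empty in single, single in nested, empty in nested)
```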


Unions and Intersections 


The union of the sets A and B, written A ∪ B, is the set of elements that are in A or B (or both). 


A ∪ B = {x | x ∈ A or x ∈ B} 


The intersection of A and B, written A ∩ B, is the set of elements that are simultaneously in A and B. 


A ∩ B = {x | x ∈ A and x ∈ B} 


The following Venn diagrams for the union and intersection of two sets can be useful for visualizing 
these operations. 


[Figure: Venn diagrams of A ∪ B (left) and A ∩ B (right).] 


Example 2.11: 


1. Let A = {0, 1, 2, 3, 4} and B = {3, 4, 5, 6}. Then A ∪ B = {0, 1, 2, 3, 4, 5, 6} and A ∩ B = {3, 4}. 
See the figure below for a visual representation of A, B, A ∪ B, and A ∩ B. 


[Figure: Venn diagram of A and B inside the universal set U.] 


2. Recall that the set of natural numbers is N = {0, 1, 2, 3, ...} and the set of integers is 
Z = {..., -4, -3, -2, -1, 0, 1, 2, 3, 4, ...}. Observe that in this case, we have N ⊆ Z. Also, 
N ∪ Z = Z and N ∩ Z = N. 


In fact, whenever A and B are sets and B ⊆ A, then A ∪ B = A and A ∩ B = B. We will prove 
the first of these two facts in Theorem 2.5. You will be asked to prove the second of these facts 
in Problem 13 below. 


3. Let E = {0, 2, 4, 6, ...} be the set of even natural numbers and let O = {1, 3, 5, 7, ...} be the set 
of odd natural numbers. Then E ∪ O = {0, 1, 2, 3, 4, 5, 6, 7, ...} = N and E ∩ O = Ø. In general, 
we say that sets A and B are disjoint or mutually exclusive if A ∩ B = Ø. Below is a Venn 
diagram for disjoint sets. 


[Figure: Venn diagram of two disjoint sets, A ∩ B = Ø.] 
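

The computations in Example 2.11 can be reproduced with built-in set operations. The sketch below is illustrative: the variable names are assumptions, and since infinite sets cannot be stored directly, the even and odd natural numbers are truncated to those below 20.

```python
# Part 1: a union and an intersection of two finite sets.
A = {0, 1, 2, 3, 4}
B = {3, 4, 5, 6}
union = A | B
intersection = A & B

# Part 3, truncated to the natural numbers below 20 (an assumption made
# only so that the sets are finite).
evens = set(range(0, 20, 2))
odds = set(range(1, 20, 2))
disjoint = evens.isdisjoint(odds)

# When one set is a subset of another, the union is the larger set
# and the intersection is the smaller set.
small, large = {3, 4}, {0, 1, 2, 3, 4}
absorbs = (large | small == large) and (large & small == small)

print(union, intersection, disjoint, absorbs)
```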


Let’s prove some theorems involving unions of sets. You will be asked to prove the analogous results 
for intersections of sets in Problems 11 and 13 below. 


Theorem 2.4: If A and B are sets, then A ⊆ A ∪ B. 


Before going through the proof, look once more at the Venn diagram above for A ∪ B and convince 
yourself that this theorem should be true. 


Proof of Theorem 2.4: Suppose that A and B are sets and let x ∈ A. Then x ∈ A or x ∈ B. Therefore, 
x ∈ A ∪ B. Since x was an arbitrary element of A, we have shown that every element of A is an element 
of A ∪ B. That is, ∀x(x ∈ A → x ∈ A ∪ B) is true. Therefore, A ⊆ A ∪ B. □ 


Note: Recall from Lesson 1 that if p is a true statement, then p ∨ q (p or q) is true no matter what the 
truth value of q is. In the second sentence of the proof above, we are using this fact with p being the 
statement x ∈ A and q being the statement x ∈ B. 


We will use this same reasoning in the second paragraph of the next proof as well. 


Theorem 2.5: B ⊆ A if and only if A ∪ B = A. 


Before going through the proof, it’s a good idea to draw a Venn diagram for B ⊆ A and convince 
yourself that this theorem should be true. 


Technical note: Let X and Y be sets. The Axiom of Extensionality says that X and Y are the same set if 
and only if X and Y have precisely the same elements. In symbols, we have 


X = Y if and only if ∀x(x ∈ X ↔ x ∈ Y). 


It is easy to verify that p ↔ q is logically equivalent to (p → q) ∧ (q → p). To see this, we check that 
all possible truth assignments for p and q lead to the same truth value for the two statements. For 
example, if p and q are both true, then 


p ↔ q ≡ T ↔ T ≡ T and (p → q) ∧ (q → p) ≡ (T → T) ∧ (T → T) ≡ T ∧ T ≡ T. 


The reader should check the other three truth assignments for p and q, or draw the entire truth table 
for both statements. 
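

The full truth table can also be generated mechanically. The following is a minimal sketch: the helper names implies and iff are assumptions introduced here, and Python's True and False play the roles of T and F.

```python
from itertools import product


def implies(p, q):
    """Truth value of the conditional: false only when p is true and q is false."""
    return (not p) or q


def iff(p, q):
    """Truth value of the biconditional: true when p and q agree."""
    return p == q


rows = []
for p, q in product([True, False], repeat=2):
    left = iff(p, q)
    right = implies(p, q) and implies(q, p)
    rows.append((p, q, left, right))
    print(p, q, left, right)

# The two statements are logically equivalent when the last two
# columns agree in every row of the table.
equivalent = all(left == right for _, _, left, right in rows)
```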


Letting p be the statement x ∈ X, letting q be the statement x ∈ Y, and replacing p ↔ q by the logically 
equivalent statement (p → q) ∧ (q → p) gives us 


X = Y if and only if ∀x((x ∈ X → x ∈ Y) ∧ (x ∈ Y → x ∈ X)). 


It is also true that ∀x(p(x) ∧ q(x)) is logically equivalent to ∀x(p(x)) ∧ ∀x(q(x)). And so, we have 


X = Y if and only if ∀x(x ∈ X → x ∈ Y) and ∀x(x ∈ Y → x ∈ X). 


In other words, to show that X = Y, we can instead show that X ⊆ Y and Y ⊆ X. 


Proof of Theorem 2.5: Suppose that B ⊆ A and let x ∈ A ∪ B. Then x ∈ A or x ∈ B. If x ∈ A, then 
x ∈ A (trivially). If x ∈ B, then since B ⊆ A, it follows that x ∈ A. Since x was an arbitrary element of 
A ∪ B, we have shown that every element of A ∪ B is an element of A. That is, ∀x(x ∈ A ∪ B → x ∈ A) 
is true. Therefore, A ∪ B ⊆ A. By Theorem 2.4, A ⊆ A ∪ B. Since A ∪ B ⊆ A and A ⊆ A ∪ B, it follows 
that A ∪ B = A. 


Now, suppose that A ∪ B = A and let x ∈ B. Since x ∈ B, it follows that x ∈ A or x ∈ B. Therefore, 
x ∈ A ∪ B. Since A ∪ B = A, we have x ∈ A. Since x was an arbitrary element of B, we have shown 
that every element of B is an element of A. That is, ∀x(x ∈ B → x ∈ A). Therefore, B ⊆ A. □ 


Problem Set 2 


Full solutions to these problems are available for free download here: 


www.SATPrepGet800.com/PMFBXSG 


LEVEL 1 
1. Determine whether each of the following statements is true or false: 
(i) 2 ∈ {2} 
(ii) 5 ∈ Ø 
(iii) Ø ∈ {1, 2} 
(iv) a ∈ {b, {a}} 
(v) Ø ⊆ {1, 2} 
(vi) {Δ} ⊆ {δ, Δ} 
(vii) {a, b, c} ⊆ {a, b, c} 
(viii) {1, a, {2, b}} ⊆ {1, a, 2, b} 


2. Determine the cardinality of each of the following sets: 
(i) {a, b, c, d, e, f} 
(ii) {1, 2, 3, 2, 1} 
(iii) {1, 2, ..., 53} 
(iv) {5, 6, 7, ..., 2076, 2077} 


3. Let A = {a, b, Δ, δ} and B = {b, c, δ, γ}. Determine each of the following: 


(i) A ∪ B 
(ii) A ∩ B 

LEVEL 2 
4. Determine whether each of the following statements is true or false: 
(i) Ø ∈ Ø 
(ii) Ø ∈ {Ø} 
(iii) {Ø} ∈ Ø 
(iv) {Ø} ∈ {Ø} 
(v) Ø ⊆ Ø 
(vi) Ø ⊆ {Ø} 
(vii) {Ø} ⊆ Ø 
(viii) {Ø} ⊆ {Ø} 


5. Determine the cardinality of each of the following sets: 


© (0, (1,2, 33} 

(ii) ffo, (0}}}t 

(iii) {{1,2}, Ø, {9}, {Ø, (9,1, 23}} 
(iv) foto HO, {0,10 oD) 


6. Let P = {Ø, {Ø}} and Q = {{Ø}, {Ø, {Ø}}}. Determine each of the following: 


(i) P ∪ Q 
(ii) P ∩ Q 

LEVEL 3 


7. How many subsets does {a, b, c, d} have? Draw a tree diagram for the subsets of {a, b, c,d}. 


8. A set A is transitive if ∀x(x ∈ A → x ⊆ A) (in words, every element of A is also a subset of A). 
Determine if each of the following sets is transitive: 


(i) Ø 

(ii) {Ø} 

(iii) {{Ø}} 

(iv) {Ø, {Ø}, {{Ø}}} 


LEVEL 4 


9. A relation R is reflexive if ∀x(xRx) and symmetric if ∀x∀y(xRy → yRx). Show that ⊆ is 
reflexive, but ∈ is not. Then decide if each of ⊆ and ∈ is symmetric. 


10. Let A, B, C, D, and E be sets such that A ⊆ B, B ⊆ C, C ⊆ D, and D ⊆ E. Prove that A ⊆ E. 


11. Let A and B be sets. Prove that A ∩ B ⊆ A. 


LEVEL 5 
12. Let P(x) be the property x ∉ x. Prove that {x | P(x)} cannot be a set. 

13. Prove that B ⊆ A if and only if A ∩ B = B. 

14. Let A = {a, b, c, d}, B = {X | X ⊆ A ∧ d ∉ X}, and C = {X | X ⊆ A ∧ d ∈ X}. Show that there is 
a natural one-to-one correspondence between the elements of B and the elements of C. Then 
generalize this result to a set with n + 1 elements for n > 0. 


LESSON 3 - ABSTRACT ALGEBRA 
SEMIGROUPS, MONOIDS, AND GROUPS 


Binary Operations and Closure 


A binary operation on a set is a rule that combines two elements of the set to produce another element 
of the set. 


Example 3.1: Let S = {0,1}. Multiplication on S is a binary operation, whereas addition on S is not a 
binary operation (here we are thinking of multiplication and addition in the “usual” sense, meaning the 
way we would think of them in elementary school or middle school). 


To see that multiplication is a binary operation on S, observe that 0 · 0 = 0, 0 · 1 = 0, 1 · 0 = 0, and 
1 · 1 = 1. Each of the four computations produces 0 or 1, both of which are in the set S. 


To see that addition is not a binary operation on S, just note that 1 + 1 = 2, and 2 ∉ S. 
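

For a finite set, the check in Example 3.1 amounts to trying every pair of elements. The sketch below is illustrative only: the function name is_binary_operation is an assumption, and operator.mul and operator.add from the standard library stand in for the usual multiplication and addition.

```python
import operator
from itertools import product


def is_binary_operation(op, s):
    """Return True when op(a, b) lies in s for every pair (a, b) from s."""
    return all(op(a, b) in s for a, b in product(s, repeat=2))


S = {0, 1}
mult_ok = is_binary_operation(operator.mul, S)  # every product is 0 or 1
add_ok = is_binary_operation(operator.add, S)   # fails because 1 + 1 = 2

print(mult_ok, add_ok)
```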


Let’s get a bit more technical and write down the formal definition of a binary operation. The 
terminology and notation used in this definition will be clarified in the notes below and formalized 
more rigorously later in Lesson 10. 


Formally, a binary operation ⋆ on a set S is a function ⋆: S × S → S. So, if a, b ∈ S, then we have 
⋆(a, b) ∈ S. For easier readability, we will usually write ⋆(a, b) as a ⋆ b. 


Notes: (1) If A and B are sets, then A × B is called the Cartesian product of A and B. It consists of the 
ordered pairs (a, b), where a ∈ A and b ∈ B. A function f: A × B → C takes each such pair (a, b) to 
an element f(a, b) ∈ C. 


As an example, let A = {dog, fish}, B = {cat, snake}, C = {0, 2, 4, 6, 8}, and define f: A × B → C by 
f(a, b) = the total number of legs that animals a and b have. Then we have f(dog, cat) = 8, 
f(dog, snake) = 4, f(fish, cat) = 4, f(fish, snake) = 0. 

We will look at ordered pairs, Cartesian products, and functions in more detail in Lesson 10. 

(2) For a binary operation, all three sets A, B, and C in the expression f: A × B → C are the same. 

As we saw in Example 3.1 above, if we let S = {0, 1}, and we let ⋆ be multiplication, then ⋆ is a binary 
operation on S. Using function notation, we have ⋆(0, 0) = 0, ⋆(0, 1) = 0, ⋆(1, 0) = 0, and 
⋆(1, 1) = 1. 


As stated in the formal definition of a binary operation above, we will usually write the computations 
as 0 ⋆ 0 = 0, 0 ⋆ 1 = 0, 1 ⋆ 0 = 0, and 1 ⋆ 1 = 1. 


We can use symbols other than ⋆ for binary operations. For example, if the operation is multiplication, 
we would usually use a dot (·) for the operation as we did in Example 3.1 above. Similarly, for addition 
we would usually use +, for subtraction we would usually use -, and so on. 


Recall: N = {0, 1, 2, 3, ...} is the set of natural numbers and Z = {..., -4, -3, -2, -1, 0, 1, 2, 3, 4, ...} is 
the set of integers. 


If A is a set of numbers, we let A⁺ be the subset of A consisting of just the positive numbers from A. 
For example, Z⁺ = {1, 2, 3, 4, ...}, and in fact, N⁺ = Z⁺. 


Example 3.2: 


1. The operation of addition on the set of natural numbers is a binary operation because whenever 
we add two natural numbers we get another natural number. Here, the set S is N and the 
operation ⋆ is +. Observe that if a ∈ N and b ∈ N, then a + b ∈ N. For example, if a = 1 and 
b = 2 (both elements of N), then a + b = 1 + 2 = 3, and 3 ∈ N. 


2. The operation of multiplication on the set of positive integers is a binary operation because 
whenever we multiply two positive integers we get another positive integer. Here, the set S is 
Z⁺ and the operation ⋆ is ·. Observe that if a ∈ Z⁺ and b ∈ Z⁺, then a · b ∈ Z⁺. For example, if 
a = 3 and b = 5 (both elements of Z⁺), then a · b = 3 · 5 = 15, and 15 ∈ Z⁺. 


3. Let S = Z and define ⋆ by a ⋆ b = min{a, b}, where min{a, b} is the smaller of a and b. Then ⋆ 
is a binary operation on Z. For example, if a = -5 and b = 3 (both elements of Z), then 
a ⋆ b = -5, and -5 ∈ Z. 


4. Subtraction on the set of natural numbers is not a binary operation. To see this, we just need 
to provide a single counterexample. (A counterexample is an example that is used to prove that 
a statement is false.) If we let a = 1 and b = 2 (both elements of N), then we see that 
a - b = 1 - 2 is not an element of N. 


5. Let S = {u, v, w} and define ⋆ using the following table: 


[Table: multiplication table for ⋆ on S = {u, v, w}, with rows and columns labeled u, v, w.] 


The table given above is called a multiplication table. For a, b ∈ S, we evaluate a ⋆ b by taking 
the entry in the row given by a and the column given by b. For example, v ⋆ w = u. 


⋆ is a binary operation on S because the only possible “outputs” are u, v, and w. 


Some authors refer to a binary operation ⋆ on a set S even when the binary operation is not defined 
on all pairs of elements a, b ∈ S. We will always refer to these “false operations” as partial binary 
operations. 


We say that the set S is closed under the partial binary operation ⋆ if whenever a, b ∈ S, we have 
a ⋆ b ∈ S. 


In Example 3.2, part 4 above, we saw that subtraction is a partial binary operation on N that is not a 
binary operation. In other words, N is not closed under subtraction. 


Semigroups and Associativity 


Let ⋆ be a binary operation on a set S. We say that ⋆ is associative in S if for all x, y, z in S, we have 
(x ⋆ y) ⋆ z = x ⋆ (y ⋆ z). 

A semigroup is a pair (S, ⋆), where S is a set and ⋆ is an associative binary operation on S. 


Example 3.3: 


1. (N, +), (Z, +), (N, ·), and (Z, ·) are all semigroups. In other words, the operations of addition 
and multiplication are both associative in N and Z. 


2. Let S = Z and define ⋆ by a ⋆ b = min{a, b}, where min{a, b} is the smaller of a and b. Let’s 
check that ⋆ is associative in Z. Let a, b, and c be elements of Z. There are actually 6 cases to 
consider (see Note 1 below). Let’s go through one of these cases in detail. If we assume that 
a ≤ b ≤ c, then we have 


(a ⋆ b) ⋆ c = min{a, b} ⋆ c = a ⋆ c = min{a, c} = a. 

a ⋆ (b ⋆ c) = a ⋆ min{b, c} = a ⋆ b = min{a, b} = a. 

Since both (a ⋆ b) ⋆ c = a and a ⋆ (b ⋆ c) = a, we have (a ⋆ b) ⋆ c = a ⋆ (b ⋆ c). After 
checking the other 5 cases, we can say the following: Since a, b, and c were arbitrary elements 
from Z, we have shown that ⋆ is associative in Z. It follows that (Z, ⋆) is a semigroup. 


3. Subtraction is not associative in Z. To see this, we just need to provide a single counterexample. 
If we let a = 1, b = 2, and c = 3, then (a - b) - c = (1 - 2) - 3 = -1 - 3 = -4 and 
a - (b - c) = 1 - (2 - 3) = 1 - (-1) = 1 + 1 = 2. Since -4 ≠ 2, subtraction is not 
associative in Z. It follows that (Z, -) is not a semigroup. 


Note that (N, -) is also not a semigroup, but for a different reason. Subtraction is not even a 
binary operation on N (see part 4 in Example 3.2). 


4. Let S = {u, v, w} and define ⋆ using the following table (this is the same table from part 5 in 
Example 3.2): 


[Table: multiplication table for ⋆ on S = {u, v, w}, with rows and columns labeled u, v, w.] 


Notice that (u ⋆ v) ⋆ w = w ⋆ w = v and u ⋆ (v ⋆ w) = u ⋆ u = v. 


So, (u ⋆ v) ⋆ w = u ⋆ (v ⋆ w). However, this single computation does not show that ⋆ is 
associative in S. In fact, we have the following counterexample: (u ⋆ w) ⋆ v = w ⋆ v = v and 
u ⋆ (w ⋆ v) = u ⋆ v = w. Thus, (u ⋆ w) ⋆ v ≠ u ⋆ (w ⋆ v). 


So, ⋆ is not associative in S, and therefore, (S, ⋆) is not a semigroup. 


5. Let 2Z = {..., -6, -4, -2, 0, 2, 4, 6, ...} be the set of even integers. When we multiply two even 
integers together, we get another even integer (we will prove this in Lesson 4). It follows that 
multiplication is a binary operation on 2Z. Since multiplication is associative in Z and 2Z ⊆ Z, 
it follows that multiplication is associative in 2Z (see Note 2 below). So, (2Z, ·) is a semigroup. 

Notes: (1) In part 2 above, we must prove the result for each of the following 6 cases: 

a ≤ b ≤ c    a ≤ c ≤ b    b ≤ a ≤ c    b ≤ c ≤ a    c ≤ a ≤ b    c ≤ b ≤ a 

The same basic argument can be used for all these cases. For example, we saw in the solution above 
that for the first case we get 

(a ⋆ b) ⋆ c = min{a, b} ⋆ c = a ⋆ c = min{a, c} = a. 

a ⋆ (b ⋆ c) = a ⋆ min{b, c} = a ⋆ b = min{a, b} = a. 


Let’s also do the last case c ≤ b ≤ a: 


(a ⋆ b) ⋆ c = min{a, b} ⋆ c = b ⋆ c = min{b, c} = c. 

a ⋆ (b ⋆ c) = a ⋆ min{b, c} = a ⋆ c = min{a, c} = c. 


The reader should verify the other 4 cases to complete the proof. 
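

Before working through the remaining cases by hand, it can be reassuring to test associativity on a finite sample of integers. The sketch below is illustrative only: the function name is_associative and the sample range are assumptions, and a finite check gives evidence rather than a proof. The same helper also finds that subtraction fails.

```python
from itertools import product


def is_associative(op, elements):
    """Check (a op b) op c == a op (b op c) for all triples from elements."""
    return all(
        op(op(a, b), c) == op(a, op(b, c))
        for a, b, c in product(elements, repeat=3)
    )


sample = range(-5, 6)  # a finite sample of Z, chosen arbitrarily

min_associative = is_associative(min, sample)
sub_associative = is_associative(lambda a, b: a - b, sample)

print(min_associative, sub_associative)
```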


(2) Associativity is closed downwards. By this, we mean that if ⋆ is associative in a set A, and B ⊆ A 
(B is a subset of A), then ⋆ is associative in B. 


The reason for this is that the definition of associativity involves only a universal statement—a 
statement that describes a property that is true for all elements without mentioning the existence of 
any new elements. A universal statement begins with the quantifier ∀ (“For all” or “Every”) and never 
includes the quantifier ∃ (“There exists” or “There is”). 


As a simple example, if every object in set A is a fruit, and B ⊆ A, then every object in B is a fruit. The 
universal statement we are referring to might be ∀x(P(x)), where P(x) is the property “x is a fruit.” 


In the case of associativity, the universal statement is ∀x∀y∀z((x ⋆ y) ⋆ z = x ⋆ (y ⋆ z)). 


Let ⋆ be a binary operation on a set S. We say that ⋆ is commutative (or Abelian) in S if for all x, y in 
S, we have x ⋆ y = y ⋆ x. 


Example 3.4: 


1. (N, +), (Z, +), (N, ·), and (Z, ·) are all commutative semigroups. In other words, the 
operations of addition and multiplication are both commutative in N and Z (in addition to being 
associative). 


2. The semigroup (Z, ⋆), where ⋆ is defined by a ⋆ b = min{a, b}, is a commutative semigroup. 
Let’s check that ⋆ is commutative in Z. Let a and b be elements of Z. This time there are just 2 
cases to consider (a ≤ b and b ≤ a). Let’s do the first case in detail, and assume that a ≤ b. 
We then have a ⋆ b = min{a, b} = a and b ⋆ a = min{b, a} = a. So, a ⋆ b = b ⋆ a. After 
verifying the other case (which you should do), we can say that ⋆ is commutative in Z. 


3. Define the binary operation ⋆ on N by a ⋆ b = a. Then (N, ⋆) is a semigroup that is not 
commutative. For associativity, we have (a ⋆ b) ⋆ c = a ⋆ c = a and a ⋆ (b ⋆ c) = a ⋆ b = a. 
Let’s use a counterexample to show that ⋆ is not commutative: 2 ⋆ 5 = 2 and 5 ⋆ 2 = 5, so 2 ⋆ 5 ≠ 5 ⋆ 2. 


Note: In part 3 above, the computation a ⋆ (b ⋆ c) can actually be done in 1 step instead of 2. The way 
we did it above was to first compute b ⋆ c = b, and then to replace b ⋆ c with b to get 
a ⋆ (b ⋆ c) = a ⋆ b = a. However, the definition of ⋆ says that a ⋆ (anything) = a. In this case, the 
“anything” is b ⋆ c. So, we have a ⋆ (b ⋆ c) = a just by appealing to the definition of ⋆. 
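

The operation from part 3 is easy to experiment with. The sketch below is illustrative: the function name left and the sample of natural numbers are assumptions. It checks associativity on every triple from the sample and then looks for pairs where the order of the inputs matters.

```python
from itertools import product


def left(a, b):
    """The operation from part 3: the result is always the first input."""
    return a


sample = range(0, 8)  # a finite sample of N, chosen arbitrarily

associative = all(
    left(left(a, b), c) == left(a, left(b, c))
    for a, b, c in product(sample, repeat=3)
)

# Pairs (a, b) where swapping the inputs changes the result.
non_commuting = [
    (a, b) for a, b in product(sample, repeat=2) if left(a, b) != left(b, a)
]

print(associative, (2, 5) in non_commuting)
```

Every pair of distinct elements appears in non_commuting, so the pair (2, 5) used above is only one of many counterexamples.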


Monoids and Identity 


Let (S, ⋆) be a semigroup. An element e of S is called an identity with respect to the binary operation 
⋆ if for all a ∈ S, we have e ⋆ a = a ⋆ e = a. 


A monoid is a semigroup with an identity. 


Example 3.5: 


1. (N, +) and (Z, +) are commutative monoids with identity 0 (when we add 0 to any integer a, 
we get a). (N, ·) and (Z, ·) are commutative monoids with identity 1 (when we multiply any 
integer a by 1, we get a). 


2. The commutative semigroup (Z, ⋆), where ⋆ is defined by a ⋆ b = min{a, b}, is not a monoid. 
To see this, let a ∈ Z. Then a + 1 ∈ Z and a ⋆ (a + 1) = a ≠ a + 1. This shows that a is not an 
identity. Since a was an arbitrary element of Z, we showed that there is no identity. It follows 
that (Z, ⋆) is not a monoid. 


3. The noncommutative semigroup (N, ⋆), where a ⋆ b = a, is also not a monoid. Use the same 
argument given in 2 above with Z replaced by N. 


4. (2Z, ·) is another example of a semigroup that is not a monoid. The identity element of (Z, ·) 
is 1, and this element is missing from (2Z, ·). 
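

A search for identity elements over a finite sample shows both what such a check can confirm and where it can mislead. The sketch below is illustrative: the function name identities and the sample are assumptions. Note in particular that on a finite sample the largest element behaves like an identity for min; Z has no largest element, which is exactly why the argument in part 2 works.

```python
def identities(op, elements):
    """Return every e with op(e, a) == a and op(a, e) == a for all a in elements."""
    elems = list(elements)
    return [
        e for e in elems
        if all(op(e, a) == a and op(a, e) == a for a in elems)
    ]


sample = list(range(-6, 7))                  # finite sample of Z
evens = [x for x in sample if x % 2 == 0]    # finite sample of 2Z

add_ids = identities(lambda a, b: a + b, sample)    # only 0 works
mul_ids = identities(lambda a, b: a * b, sample)    # only 1 works
even_ids = identities(lambda a, b: a * b, evens)    # 1 is missing, so none
min_ids = identities(min, sample)                   # the largest sample element

print(add_ids, mul_ids, even_ids, min_ids)
```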


Groups and Inverses 


Let (M, ⋆) be a monoid with identity e. An element a of M is called invertible if there is an element 
b ∈ M such that a ⋆ b = b ⋆ a = e. 


A group is a monoid in which every element is invertible. 


Groups appear so often in mathematics that it’s worth taking the time to explicitly spell out the full 
definition of a group. 
A group is a pair (G, ⋆) consisting of a set G together with a binary operation ⋆ satisfying: 

(1) (Associativity) For all x, y, z ∈ G, (x ⋆ y) ⋆ z = x ⋆ (y ⋆ z). 

(2) (Identity) There exists an element e ∈ G such that for all x ∈ G, e ⋆ x = x ⋆ e = x. 

(3) (Inverse) For each x ∈ G, there is y ∈ G such that x ⋆ y = y ⋆ x = e. 


Notes: (1) If y ∈ G is an inverse of x ∈ G, we will usually write y = x⁻¹. 


(2) Recall that the definition of a binary operation already implies closure. However, many books on 
groups will mention this property explicitly: 


(Closure) For all x, y ∈ G, x ⋆ y ∈ G. 


(3) A group is commutative or Abelian if for all x, y ∈ G, x ⋆ y = y ⋆ x. 


Example 3.6: 
1. (Z, +) is a commutative group with identity 0. The inverse of any integer a is the integer -a. 


2. (N, +) is a commutative monoid that is not a group. For example, the natural number 1 has no 
inverse in N. In other words, the equation x + 1 = 0 has no solution in N. 


3. (Z, ·) is a commutative monoid that is not a group. For example, the integer 2 has no inverse 
in Z. In other words, the equation 2x = 1 has no solution in Z. 


4. A rational number is a number of the form a/b, where a and b are integers and b ≠ 0. 


We identify rational numbers a/b and c/d whenever ad = bc. For example, 1/2 and 3/6 represent the 
same rational number because 1 · 6 = 6 and 2 · 3 = 6. 


We denote the set of rational numbers by Q. So, we have Q = {a/b | a, b ∈ Z, b ≠ 0}. In words, 
Q is “the set of quotients a over b such that a and b are integers and b is not zero.” 


We identify the rational number a/1 with the integer a. In this way, we have Z ⊆ Q. 


We add two rational numbers using the rule a/b + c/d = (a · d + b · c)/(b · d). 


Note that 0 = 0/1 is an identity for (Q, +) because a/b + 0/1 = (a · 1 + b · 0)/(b · 1) = a/b 
and 0/1 + a/b = (0 · b + 1 · a)/(1 · b) = a/b. 


You will be asked to show in Problem 11 below that (Q, +) is a commutative group. 


5. We multiply two rational numbers using the rule a/b · c/d = (a · c)/(b · d). 

Note that 1 = 1/1 is an identity for (Q, ·) because a/b · 1/1 = (a · 1)/(b · 1) = a/b 
and 1/1 · a/b = (1 · a)/(1 · b) = a/b. 


Now, 0 · a/b = 0/1 · a/b = (0 · a)/(1 · b) = 0/b = 0. In particular, when we multiply 0 by any rational number, we 
can never get 1. So, 0 is a rational number with no multiplicative inverse. It follows that (Q, ·) 
is not a group. 


However, 0 is the only rational number without a multiplicative inverse. In fact, you will be 
asked to show in Problem 9 below that (Q*, ·) is a commutative group, where Q* is the set of 
rational numbers with 0 removed. 
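

The rules for rational arithmetic can be tried out with exact fractions. The sketch below is illustrative only: the helper names add and mul are assumptions introduced here, while Fraction comes from Python's standard fractions module and already identifies a/b with c/d whenever ad = bc.

```python
from fractions import Fraction


def add(a, b, c, d):
    """Return a/b + c/d using the rule (a*d + b*c) / (b*d)."""
    return Fraction(a * d + b * c, b * d)


def mul(a, b, c, d):
    """Return (a/b) * (c/d) using the rule (a*c) / (b*d)."""
    return Fraction(a * c, b * d)


same_number = Fraction(1, 2) == Fraction(3, 6)        # 1 * 6 == 2 * 3
zero_is_identity = add(3, 4, 0, 1) == Fraction(3, 4)  # a/b + 0/1 = a/b
one_is_identity = mul(3, 4, 1, 1) == Fraction(3, 4)   # a/b * 1/1 = a/b
zero_times = mul(0, 1, 3, 4) == 0                     # 0 * a/b = 0
has_inverse = mul(3, 4, 4, 3) == 1                    # 3/4 * 4/3 = 1

print(same_number, zero_is_identity, one_is_identity, zero_times, has_inverse)
```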


Note: When multiplying two numbers, we sometimes drop the dot (·) for easier readability. So, we may 
write x · y as xy. We may also use parentheses instead of the dot. For example, we might write a/b · c/d as 
(a/b)(c/d), whereas we would probably write (a · c)/(b · d) as ac/bd. We may even use this simplified notation for 
arbitrary group operations. So, we could write a ⋆ b as ab. However, we will avoid doing this if it would 
lead to confusion. For example, we will not write a + b as ab. 


Problem Set 3 


Full solutions to these problems are available for free download here: 


www.SATPrepGet800.com/PMFBXSG 
LEVEL 1 


1. For each of the following multiplication tables defined on the set S = {a, b}, determine if each 
of the following is true or false: 


(i) ⋆ defines a binary operation on S. 
(ii) ⋆ is commutative in S. 
(iii) a is an identity with respect to ⋆. 
(iv) b is an identity with respect to ⋆. 


    I              II             III            IV 
⋆ | a  b       ⋆ | a  b       ⋆ | a  b       ⋆ | a  b 
--+------      --+------      --+------      --+------ 
a | a  a       a | a  b       a | a  b       a | a  a 
b | a  a       b | c  a       b | b  a       b | b  b 

2. Show that there are exactly two monoids on the set S = {e, a}, where e is the identity. Which of 
these monoids are groups? Which of these monoids are commutative? 


LEVEL 2 


3. Let G = {e, a, b} and let (G, ⋆) be a group with identity element e. Draw a multiplication table 
for (G, ⋆). 


4. Prove that in any monoid (M, ⋆), the identity element is unique. 


LEVEL 3 


5. Assume that a group (G, ⋆) of order 4 exists with G = {e, a, b, c}, where e is the identity, 
a² = b and b² = e. Construct the table for the operation of such a group. 


6. Prove that in any group (G, ⋆), each element has a unique inverse. 


LEVEL 4 


7. Let (G, ⋆) be a group with a, b ∈ G, and let a⁻¹ and b⁻¹ be the inverses of a and b, respectively. 
Prove 


(i) (a ⋆ b)⁻¹ = b⁻¹ ⋆ a⁻¹. 


(ii) the inverse of a⁻¹ is a. 


8. Let (G, ⋆) be a group such that a² = e for all a ∈ G. Prove that (G, ⋆) is commutative. 


9. Prove that (Q*, ·) is a commutative group. 


LEVEL 5 


10. Prove that there are exactly two groups of order 4, up to renaming the elements. 
11. Show that (Q, +) is a commutative group. 


12. Let S = {a, b}, where a ≠ b. How many binary operations are there on S? How many semigroups 
are there of the form (S, ⋆), up to renaming the elements? 


LESSON 4 - NUMBER THEORY 
THE RING OF INTEGERS 


Rings and Distributivity 


Before giving the general definition of a ring, let’s look at an important example. 


Example 4.1: Recall that Z = {...,-4,-3,-2,-1,0,1, 2, 3,4,...} is the set of integers. Let’s go over 
some of the properties of addition and multiplication on this set. 


1. Z is closed under addition. In other words, whenever we add two integers, we get another 
integer. For example, 2 and 3 are integers, and we have 2 + 3 = 5, which is also an integer. As 
another example, -8 and 6 are integers, and so is -8 + 6 = -2. 


2. Addition is commutative in Z. In other words, when we add two integers, it does not matter 
which one comes first. For example, 2 + 3 = 5 and 3 + 2 = 5. So, we see that 2 +3 = 3 +2. 
As another example, -8 + 6 = -2 and 6 + (-8) = -2. So, we see that -8 + 6 = 6 + (-8). 


3. Addition is associative in Z. In other words, when we add three integers, it doesn’t matter if we 
begin by adding the first two or the last two integers. For example, (2 + 3) + 4 = 5 + 4 = 9 
and 2 + (3 + 4) = 2 + 7 = 9. So, (2 + 3) + 4 = 2 + (3 + 4). As another example, we have 
(-8 + 6) + (-5) = -2 + (-5) = -7 and -8 + (6 + (-5)) = -8 + 1 = -7. So, we see that 
(-8 + 6) + (-5) = -8 + (6 + (-5)). 


4. Z has an identity for addition, namely 0. Whenever we add 0 to another integer, the result is 
that same integer. For example, we have 0 + 3 = 3 and 3 + 0 = 3. As another example, 
0 + (-5) = -5 and (-5) + 0 = -5. 


5. Every integer has an additive inverse. This is an integer that we add to the original integer to 
get 0 (the additive identity). For example, the additive inverse of 5 is -5 because we have 
5 + (-5) = 0 and -5 + 5 = 0. Notice that the same two equations also show that the inverse 
of -5 is 5. We can say that 5 and -5 are additive inverses of each other. 


We can summarize the five properties above by saying that (Z, +) is a commutative group. 


6. Z is closed under multiplication. In other words, whenever we multiply two integers, we get
another integer. For example, 2 and 3 are integers, and we have 2 · 3 = 6, which is also an
integer. As another example, -3 and -4 are integers, and so is (-3)(-4) = 12.


7. Multiplication is commutative in Z. In other words, when we multiply two integers, it does not
matter which one comes first. For example, 2 · 3 = 6 and 3 · 2 = 6. So, 2 · 3 = 3 · 2. As another
example, -8 · 6 = -48 and 6(-8) = -48. So, we see that -8 · 6 = 6(-8).


8. Multiplication is associative in Z. In other words, when we multiply three integers, it doesn’t
matter if we begin by multiplying the first two or the last two integers. For example,
(2 · 3) · 4 = 6 · 4 = 24 and 2 · (3 · 4) = 2 · 12 = 24. So, (2 · 3) · 4 = 2 · (3 · 4). As another
example, (-5 · 2) · (-6) = -10 · (-6) = 60 and -5 · (2 · (-6)) = -5 · (-12) = 60. So, we
see that (-5 · 2) · (-6) = -5 · (2 · (-6)).


9. Z has an identity for multiplication, namely 1. Whenever we multiply 1 by another integer, the
result is that same integer. For example, we have 1 · 3 = 3 and 3 · 1 = 3. As another example,
1 · (-5) = -5 and (-5) · 1 = -5.


We can summarize the four properties above by saying that (Z, ·) is a commutative monoid.


10. Multiplication is distributive over addition in Z. This means that whenever k, m, and n are
integers, we have k · (m + n) = k · m + k · n. For example, 4 · (2 + 1) = 4 · 3 = 12 and
4 · 2 + 4 · 1 = 8 + 4 = 12. So, 4 · (2 + 1) = 4 · 2 + 4 · 1. As another example, we have
-2 · ((-1) + 3) = -2(2) = -4 and -2 · (-1) + (-2) · 3 = 2 - 6 = -4. Therefore, we see
that -2 · ((-1) + 3) = -2 · (-1) + (-2) · 3.


Notes: (1) Since the properties listed in 1 through 10 above are satisfied, we say that (Z, +, ·) is a ring.
We will give the formal definition of a ring below.


(2) Observe that a ring consists of (i) a set (in this case Z), and (ii) two binary operations on the set
called addition and multiplication.


(3) (Z, +) is a commutative group and (Z, ·) is a commutative monoid. The distributive property is the
only property mentioned that requires both addition and multiplication.


(4) We see that Z is missing one nice property: the inverse property for multiplication. For example, 2
has no multiplicative inverse in Z. There is no integer n such that 2 · n = 1. So, the linear equation
2n - 1 = 0 has no solution in Z.


(5) If we replace Z by the set of natural numbers N = {0, 1, 2, ...}, then all the properties mentioned
above are satisfied except property 5, the inverse property for addition. For example, 1 has no
additive inverse in N. There is no natural number n such that n + 1 = 0.


(6) Z actually satisfies two distributive properties. Left distributivity says that whenever k, m, and n
are integers, we have k · (m + n) = k · m + k · n. Right distributivity says that whenever k, m, and n
are integers, we have (m + n) · k = m · k + n · k. Since multiplication is commutative in Z, left
distributivity and right distributivity are equivalent.


(7) Let’s show that left distributivity together with commutativity of multiplication in Z implies right
distributivity in Z. If we assume that we have left distributivity and commutativity of multiplication,
then for integers k, m, and n, we have (m + n) · k = k · (m + n) = k · m + k · n = m · k + n · k.
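
The properties in Example 4.1 can be spot-checked on a finite sample of integers. The sketch below is only an illustration under stated assumptions: the name check_integer_ring_properties and the range -6 to 6 are hypothetical choices, and a finite check cannot replace a proof for all of Z. Closure is not tested, since Python integer arithmetic always returns an integer.

```python
from itertools import product

# Hypothetical finite sample of integers (an assumption for illustration only).
sample = range(-6, 7)

def check_integer_ring_properties(values):
    """Return True if the sampled integers satisfy the properties of Example 4.1."""
    for k, m, n in product(values, repeat=3):
        if (k + m) + n != k + (m + n):        # associativity of addition
            return False
        if (k * m) * n != k * (m * n):        # associativity of multiplication
            return False
        if k * (m + n) != k * m + k * n:      # left distributivity
            return False
        if (m + n) * k != m * k + n * k:      # right distributivity
            return False
    for m, n in product(values, repeat=2):
        if m + n != n + m or m * n != n * m:  # commutativity of + and of multiplication
            return False
    for n in values:
        if 0 + n != n or n + 0 != n:          # additive identity
            return False
        if 1 * n != n or n * 1 != n:          # multiplicative identity
            return False
        if n + (-n) != 0:                     # additive inverse
            return False
    return True

# Note 4: no sampled integer n satisfies 2 * n == 1.
has_inverse_of_2 = any(2 * n == 1 for n in sample)
```

Here check_integer_ring_properties(sample) evaluates to True, while has_inverse_of_2 is False, matching the observation that 2 has no multiplicative inverse in Z.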


We are now ready to give the more general definition of a ring. 

A ring is a triple (R, +, ·), where R is a set and + and · are binary operations on R satisfying
(1) (R, +) is a commutative group.
(2) (R, ·) is a monoid.
(3) Multiplication is distributive over addition in R. That is, for all x, y, z ∈ R, we have


x · (y + z) = x · y + x · z and (y + z) · x = y · x + z · x.

Recall: The symbol ∈ is used for membership in a set. Specifically, the statement a ∈ S can be read as
“a is a member of the set S,” or more simply as “a is in S.” For example, 2 ∈ N means “2 is in the set
of natural numbers,” or more simply, “2 is a natural number.”


We will always refer to the operation + as addition and the operation · as multiplication. We will also
adjust our notation accordingly. For example, we will refer to the identity for + as 0, and the additive
inverse of an element x ∈ R as -x. Also, we will refer to the identity for · as 1, and the multiplicative
inverse of an element x ∈ R (if it exists) as x⁻¹ or 1/x.


Notes: (1) Recall from Lesson 3 that (R, +) being a commutative group means the following:
• (Closure) For all x, y ∈ R, x + y ∈ R.
• (Associativity) For all x, y, z ∈ R, (x + y) + z = x + (y + z).
• (Commutativity) For all x, y ∈ R, x + y = y + x.
• (Identity) There exists an element 0 ∈ R such that for all x ∈ R, 0 + x = x + 0 = x.
• (Inverse) For each x ∈ R, there is -x ∈ R such that x + (-x) = (-x) + x = 0.


(2) Recall from Lesson 3 that (R, ·) being a monoid means the following:
• (Closure) For all x, y ∈ R, x · y ∈ R.
• (Associativity) For all x, y, z ∈ R, (x · y) · z = x · (y · z).
• (Identity) There exists an element 1 ∈ R such that for all x ∈ R, 1 · x = x · 1 = x.


(3) Although commutativity of multiplication is not required for the definition of a ring, our most
important example (the ring of integers) satisfies this condition. When multiplication is commutative
in R, we call the ring a commutative ring. In this case we have the following additional property:


• (Commutativity) For all x, y ∈ R, x · y = y · x.


(4) Observe that we have two distributive properties in the definition for a ring. The first property is
called left distributivity and the second is called right distributivity.


(5) In a commutative ring, left distributivity implies right distributivity and vice versa. In this case, the
distributive property simplifies to


• (Distributivity) For all x, y, z ∈ R, x · (y + z) = x · y + x · z.


(6) Some authors leave out the multiplicative identity property in the definition of a ring. Those authors
call a ring that does have a multiplicative identity a unital ring or a ring with identity. Since we are
mostly concerned with the ring of integers, we will adopt the convention that a ring has a multiplicative
identity. If we do not wish to assume that R has a multiplicative identity, then we will call the structure
“almost a ring” or rng (note the missing “i”).


(7) The properties that define a ring are called the ring axioms. In general, an axiom is a statement that 
is assumed to be true. So, the ring axioms are the statements that are given to be true in all rings. There 
are many other statements that are true in rings. However, any additional statements need to be 
proved using the axioms.

Example 4.2:


1. (Z, +, ·) is a commutative ring with additive identity 0 and multiplicative identity 1. The
additive inverse of an integer a is the integer -a. This is the ring we will be focusing most of our
attention on. See Example 4.1 for more details.


2. (N, +, ·) is not a ring because (N, +) is not a group. The only group property that fails is the
additive inverse property. For example, the natural number 1 has no additive inverse. That is,
n + 1 = 0 has no solution in N. Note that (N, ·) is a commutative monoid and the distributive
property holds in N. Therefore, (N, +, ·) misses being a commutative ring by just that one
property. (N, +, ·) is an example of a structure called a semiring.


3. Recall from Example 3.6 (4 and 5) that the set of rational numbers is Q = {a/b | a, b ∈ Z, b ≠ 0}
and we define addition and multiplication on Q by a/b + c/d = (ad + bc)/(bd) and
(a/b) · (c/d) = (ac)/(bd).


(Q, +, ·) is a commutative ring with additive identity 0 = 0/1 and multiplicative identity 1 = 1/1.
The additive inverse of a rational number a/b is the rational number (-a)/b.


Q has one additional property not required in the definition of a ring. Every nonzero element
of Q has a multiplicative inverse. The inverse of the nonzero rational number a/b is the rational
number b/a. This is easy to verify: (a/b) · (b/a) = (ab)/(ba) = (ab)/(ab) = 1/1 = 1 and
(b/a) · (a/b) = (ba)/(ab) = (ab)/(ab) = 1/1 = 1. So, (Q*, ·) is a commutative group, where Q* is
the set of nonzero rational numbers.

If we replace the condition “(R, ·) is a monoid” in the definition of a ring (condition 2) with the
condition “(R*, ·) is a commutative group,” we get a structure called a field. By the remarks in
the last paragraph, we see that (Q, +, ·) is a field.
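
The arithmetic of Q described above can be tried out with Python's standard fractions module. This is a small sketch with hypothetical sample values 3/4 and -5/6 (not taken from the text); it compares Python's built-in rational arithmetic with the formulas a/b + c/d = (ad + bc)/(bd) and (a/b) · (c/d) = (ac)/(bd).

```python
from fractions import Fraction

# Hypothetical sample values: a/b = 3/4 and c/d = -5/6.
a, b, c, d = 3, 4, -5, 6
a_over_b = Fraction(a, b)
c_over_d = Fraction(c, d)

# The defining formulas for addition and multiplication on Q.
sum_by_formula = Fraction(a * d + b * c, b * d)
product_by_formula = Fraction(a * c, b * d)

# The additive inverse (-a)/b and the multiplicative inverse b/a of a/b.
additive_inverse = Fraction(-a, b)
multiplicative_inverse = Fraction(b, a)
```

With these values, a_over_b + c_over_d agrees with sum_by_formula, a_over_b * c_over_d agrees with product_by_formula, adding additive_inverse to a_over_b gives 0, and multiplying a_over_b by multiplicative_inverse gives 1.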


Technical note: The definition of semiring has one additional property: 0 · x = x · 0 = 0. Without the
additive inverse property, this new property does not follow from the others, and so, it must be listed
explicitly.


Divisibility 


An integer a is called even if there is another integer b such that a = 2b. 


Example 4.3: 
1. 6 is even because 6 = 2 · 3.
2. -14 is even because -14 = 2 · (-7).
3. We can write 1 = 2 · (1/2), but this does not show that 1 is even (and as we all know, it is not). In
the definition of even, it is very important that b is an integer. The problem here is that 1/2 is not
an integer, and so, it cannot be used as a value for b in the definition of even.


We define the sum of integers a and b to be a + b. We define the product of a and b to be a · b.


Theorem 4.1: The sum of two even integers is even.

Strategy: Before writing the proof, let’s think about our strategy. We need to start with two arbitrary
but specific even integers. Let’s call them m and n. Notice that we need to give them different names
because there is no reason that they need to have the same value.


When we try to add m and n, we get m + n. Hmmm... I see no reason yet why the expression m + n
should represent an even integer.


The problem is that we haven’t yet used the definition of even. If we invoke the definition, we get
integers j and k such that m = 2j and n = 2k.


Now, when we add m and n, we get m + n = 2j + 2k.


Is it clear that 2j + 2k represents an even integer? Nope...not yet. To be even, our final expression 
needs to have the form 2b, where b is an integer. 


Here is where we use the fact that (Z, +, ·) is a ring. Specifically, we use the distributive property to
rewrite 2j + 2k as 2(j + k).


It looks like we’ve done it. We just need to verify one more thing: is j + k an integer? Once again, we
can use the fact that (Z, +, ·) is a ring to verify this. Specifically, we use the fact that + is a binary
operation on Z.


I think we’re now ready to write the proof.


Proof of Theorem 4.1: Let m and n be even integers. Then there are integers j and k such that m = 2j
and n = 2k. So, m + n = 2j + 2k = 2(j + k) because multiplication is distributive over addition in Z.
Since Z is closed under addition, j + k ∈ Z. Therefore, m + n is even. □
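
The proof of Theorem 4.1 is constructive: from witnesses j and k for m and n, it produces the witness j + k for m + n. The following sketch mirrors that construction. The function names even_witness and sum_of_evens_witness are hypothetical and are introduced only for illustration.

```python
def even_witness(a):
    """Return an integer b with a == 2 * b, or None if a is not even."""
    b = a // 2                 # floor division always returns an integer
    return b if 2 * b == a else None

def sum_of_evens_witness(m, n):
    """Mirror the proof of Theorem 4.1: from m = 2j and n = 2k, return j + k."""
    j, k = even_witness(m), even_witness(n)
    if j is None or k is None:
        raise ValueError("both arguments must be even")
    return j + k               # m + n = 2j + 2k = 2(j + k)
```

For example, even_witness(6) returns 3 and even_witness(1) returns None, in line with Example 4.3. For m = 6 and n = -14, twice the value of sum_of_evens_witness(6, -14) equals m + n.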


The property of being even is a special case of the more general notion of divisibility. 


An integer a is divisible by an integer k, written k|a, if there is another integer b such that a = kb. We 
also say that k is a factor of a, k is a divisor of a, k divides a, or a is a multiple of k. 

Example 4.4:

1. Note that being divisible by 2 is the same as being even.

2. 18 is divisible by 3 because 18 = 3 · 6.

3. -56 is divisible by 7 because -56 = 7 · (-8).
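
The definition of divisibility translates directly into a short test. This is a sketch only; the function name divides is a hypothetical choice. The case k = 0 is handled separately because a = 0 · b forces a = 0.

```python
def divides(k, a):
    """Return True if k | a, that is, if a == k * b for some integer b."""
    if k == 0:
        return a == 0          # a = 0 * b is possible only when a = 0
    return a % k == 0          # remainder 0 means a == k * (a // k)
```

With this definition, divides(3, 18) and divides(7, -56) are both True, while divides(2, 1) is False.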


Theorem 4.2: The product of two integers that are each divisible by k is also divisible by k. 


Proof: Let m and n be integers that are divisible by k. Then there are integers b and c such that
m = kb and n = kc. So, m · n = (k · b) · (k · c) = k · (b · (k · c)) because multiplication is
associative in Z. Since Z is closed under multiplication, b · (k · c) ∈ Z. Thus, m · n is divisible by k. □


Notes: (1) If you’re confused about how associativity was used here, it might help to make the
substitution u = (k · c). Then we have (k · b) · (k · c) = (k · b) · u = k · (b · u) = k(b · (k · c)).

(2) Although it may seem tempting to simplify k · (b · (k · c)) further, it is unnecessary. The definition
of divisibility by k requires us only to generate an expression of the form k times some integer, and
that’s what we have done.


(3) If the generality of the proof confuses you, try replacing k by a specific integer. For example, if we
let k = 2, we have m = 2b, n = 2c, and therefore m · n = (2b) · (2c) = 2(b · (2c)). Is it clear that
this final expression is even (divisible by 2)?


(4) It’s worth noting that the product m · n is actually divisible by k². Indeed, we have


m · n = k · (b · (k · c)) = k · ((b · k) · c) = k · ((k · b) · c) = k · (k · (b · c)) = k²(b · c).
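
The two witnesses that appear in the proof of Theorem 4.2 and in Note 4 can be computed explicitly. In this sketch the function name product_witnesses and the sample values are hypothetical.

```python
def product_witnesses(k, b, c):
    """For m = k*b and n = k*c, return m*n with the witnesses from Theorem 4.2 and Note 4."""
    m, n = k * b, k * c
    witness_for_k = b * (k * c)        # m*n = k * (b * (k * c))
    witness_for_k_squared = b * c      # m*n = k^2 * (b * c)
    return m * n, witness_for_k, witness_for_k_squared

# Hypothetical sample: k = 7, b = 2, c = -3, so m = 14 and n = -21.
mn, w1, w2 = product_witnesses(7, 2, -3)
```

Here mn equals both 7 * w1 and 7**2 * w2, so the product is divisible by k and by k².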


Induction 


The Well Ordering Principle says that every nonempty subset of natural numbers has a least element. 
For example, the least element of N itself is 0. 


Theorem 4.3 (The Principle of Mathematical Induction): Let S be a set of natural numbers such that
(i) 0 ∈ S and (ii) for all k ∈ N, k ∈ S → k + 1 ∈ S. Then S = N.


Notes: (1) The Principle of Mathematical Induction works like a chain reaction. We know that 0 ∈ S
(this is condition (i)). Substituting 0 in for k in the expression “k ∈ S → k + 1 ∈ S” (condition (ii)) gives
us 0 ∈ S → 1 ∈ S. So, we have that 0 is in the set S, and “if 0 is in the set S, then 1 is in the set S.” So,
1 ∈ S must also be true.


(2) In terms of Lesson 1 on Sentential Logic, if we let p be the statement 0 ∈ S and q the statement
1 ∈ S, then we are given that p ∧ (p → q) is true. Observe that the only way that this statement can
be true is if q is also true. Indeed, we must have both p = T and p → q = T. If q were false, then we
would have p → q = T → F = F. So, we must have q = T.


(3) Now that we showed 1 ∈ S is true (from Note 1 above), we can substitute 1 for k in the expression
“k ∈ S → k + 1 ∈ S” (condition (ii)) to get 1 ∈ S → 2 ∈ S. So, we have 1 ∈ S ∧ (1 ∈ S → 2 ∈ S) is
true. So, 2 ∈ S must also be true.


(4) In general, we get the following chain reaction:
0 ∈ S → 1 ∈ S → 2 ∈ S → 3 ∈ S → ⋯
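
The chain reaction in Notes 1 through 4 can be simulated up to a finite bound. In this sketch, chain_reaction, step_holds, and limit are hypothetical names: step_holds(k) stands for "condition (ii) holds at k," and limit is an arbitrary cutoff, since a program cannot run through all of N.

```python
def chain_reaction(step_holds, limit):
    """Start from 0 in S and repeatedly apply 'k in S -> k + 1 in S' up to limit."""
    members = {0}                        # condition (i): 0 is in S
    k = 0
    while k < limit and step_holds(k):   # condition (ii), used at this particular k
        members.add(k + 1)
        k += 1
    return members
```

When step_holds is always True, chain_reaction(step_holds, 5) returns {0, 1, 2, 3, 4, 5}. If the implication is unavailable at k = 2, the chain stops and only {0, 1, 2} is reached, which shows why condition (ii) is needed for every k.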


I hope that the “argument” presented in Notes 1 through 4 above convinces you that the Principle of
Mathematical Induction should be true. Now let’s give a proof using the Well Ordering Principle. Proofs
involving the Well Ordering Principle are generally done by contradiction.


Proof of Theorem 4.3: Let S be a set of natural numbers such that 0 ∈ S (condition (i)), and such that
whenever k ∈ S, k + 1 ∈ S (condition (ii)). Assume toward contradiction that S ≠ N. Let
A = {k ∈ N | k ∉ S} (so, A is the set of natural numbers not in S). Since S ≠ N, A is nonempty. So, by
the Well Ordering Principle, A has a least element, let’s call it a. a ≠ 0 because 0 ∈ S and a ∉ S. So,
a - 1 ∈ N. Since a is the least element of A, a - 1 ∉ A, and so, a - 1 ∈ S. Letting k = a - 1, we have
a - 1 ∈ S → k ∈ S → k + 1 ∈ S → (a - 1) + 1 ∈ S → a ∈ S.
But a ∈ A, which means that a ∉ S. This is a contradiction, and so, S = N. □

Note: The proof given here is a proof by contradiction. A proof by contradiction works as follows:
1. We assume the negation of what we are trying to prove. 
2. We use a logically valid argument to derive a statement which is false. 


3. Since the argument was logically valid, the only possible error is our original assumption. 
Therefore, the negation of our original assumption must be true. 


In this problem we are trying to prove that S = N. The negation of this statement is that S ≠ N, and so
that is what we assume.


We then define a set A which contains elements of N that are not in S. In reality, this set is empty
(because the conclusion of the theorem is S = N). However, our (wrong!) assumption that S ≠ N tells
us that this set A actually has something in it. Saying that A has something in it is an example of a false
statement that was derived from a logically valid argument. This false statement occurred not because
of an error in our logic, but because we started with an incorrect assumption (S ≠ N).


The Well Ordering Principle then allows us to pick out the least element of this set A. Note that we can
do this because A is a subset of N. This wouldn’t work if we knew only that A was a subset of Z, as Z
does not satisfy the Well Ordering Principle (for example, Z itself has no least element).


Again, although the argument that A has a least element is logically valid, A does not actually have any
elements at all. We are working from the (wrong!) assumption that S ≠ N.


Once we have our hands on this least element a, we can get our contradiction. What can this least
element a be? Well, a was chosen to not be in S, so a cannot be 0 (because 0 is in S). Also, we know
that a - 1 ∈ S (because a is the least element not in S). But condition (ii) then forces a to be in S
(because a = (a - 1) + 1).


So, we wind up with a ∈ S, contradicting the fact that a is the least element not in S.
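
The role of the least element of A can be made concrete with a bounded search. This is a sketch under stated assumptions: least_counterexample, in_S, and limit are hypothetical names, and the search covers only the numbers below limit, so it illustrates the idea rather than the Well Ordering Principle itself.

```python
def least_counterexample(in_S, limit):
    """Return the least k below limit with k not in S, or None if there is none."""
    A = [k for k in range(limit) if not in_S(k)]   # finite stand-in for A = {k in N | k not in S}
    return min(A) if A else None
```

For the set S = {0, 1, 2, 3, 4}, the call least_counterexample(lambda k: k < 5, 20) returns a = 5. Here a - 1 = 4 is in S but a is not, so condition (ii) fails at k = 4. A set S that satisfies both conditions of Theorem 4.3 leaves no such a, and the search returns None.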


The Principle of Mathematical Induction is often written in the following way: 


(⋆) Let P(n) be a statement and suppose that (i) P(0) is true and (ii) for all k ∈ N, P(k) → P(k + 1).
Then P(n) is true for all n ∈ N.


In Problem 9 below, you will be asked to show that statement (⋆) is equivalent to Theorem 4.3.


There are essentially two steps involved in a proof by mathematical induction. The first step is to prove 
that P(0) is true (this is called the base case), and the second step is to assume that P(k) is true, and 
use this to show that P(k + 1) is true (this is called the inductive step). While doing the inductive step, 
the statement “P(k) is true” is often referred to as the inductive hypothesis. 


Subtraction in Z: For x, y ∈ Z, we define the difference x - y to be equal to the sum x + (-y). For
example, n² - n = n² + (-n) (where n² is defined to be the product n · n).


Example 4.5: Let’s use the Principle of Mathematical Induction to prove that for all natural numbers n,
n² - n is even.

Base Case (k = 0): 0² - 0 = 0 = 2 · 0. So, 0² - 0 is even.
Inductive Step: Let k ∈ N and assume that k² - k is even. Then k² - k = 2b for some integer b. Now,
(k + 1)² - (k + 1) = (k + 1)[(k + 1) - 1] = (k + 1)[k + (1 - 1)] = (k + 1)(k + 0)
= (k + 1) · k = k² + k = (k² - k) + 2k = 2b + 2k = 2(b + k).


Here we used the fact that (Z, +, ·) is a ring. Since Z is closed under addition, b + k ∈ Z. Therefore,
(k + 1)² - (k + 1) is even.


By the Principle of Mathematical Induction, n² - n is even for all n ∈ N. □
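
The induction in Example 4.5 can be read as a recipe for building the integer b with n² - n = 2b: start with b = 0 at n = 0 and replace b by b + k at each step. The sketch below follows that recipe; the names inductive_step_identity and witness are hypothetical.

```python
def inductive_step_identity(k):
    """Check (k + 1)^2 - (k + 1) == (k^2 - k) + 2k, the key equality in the inductive step."""
    return (k + 1) ** 2 - (k + 1) == (k ** 2 - k) + 2 * k

def witness(n):
    """Build b with n^2 - n == 2b by following the induction from 0 up to n."""
    b = 0                      # base case: 0^2 - 0 = 2 * 0
    for k in range(n):
        b = b + k              # inductive step: 2b + 2k = 2(b + k)
    return b
```

For instance, witness(5) returns 10, and 5² - 5 = 20 = 2 · 10. A check over a finite range is only an illustration; the proof by induction is what covers every natural number.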


Notes: (1) Instead of listing every property that we used at each step, we simply stated that all the
computations we made were allowed because (Z, +, ·) is a ring. We will discuss the property we used
at each step in the notes below.


(2) We first used left distributivity to rewrite (k + 1)² - (k + 1) as (k + 1)[(k + 1) - 1]. If you have
trouble seeing this, try working backwards, and making the substitutions x = (k + 1), y = (k + 1),
and z = -1. We then have


(k + 1)[(k + 1) - 1] = (k + 1)[(k + 1) + (-1)] = x(y + z) = xy + xz
= (k + 1)(k + 1) + (k + 1)(-1) = (k + 1)² + (-1)(k + 1) = (k + 1)² - (k + 1).


Notice how we also used commutativity of multiplication for the second to last equality.


(3) For the second algebraic step, we used associativity of addition to write


(k + 1) - 1 = (k + 1) + (-1) = k + (1 + (-1)) = k + (1 - 1).


(4) For the third algebraic step, we used the inverse property for addition to write


1 - 1 = 1 + (-1) = 0.


(5) For the fourth algebraic step, we used the additive identity property to write k + 0 = k.


(6) For the fifth algebraic step, we used right distributivity and the multiplicative identity property to
write (k + 1) · k = k · k + 1 · k = k² + k.


(7) For the sixth algebraic step, we used what I call the “Standard Advanced Calculus Trick.” I
sometimes abbreviate this as SACT. The trick is simple. If you need something to appear, just put it in.
Then correct it by performing the opposite of what you just did.


In this case, in order to use the inductive hypothesis, we need k² - k to appear, but unfortunately, we
have k² + k instead. Using SACT, I do the following:


• I simply put in what I need (and exactly where I need it): k² - k + k.
• Now, I undo the damage by performing the reverse operation: k² - k + k + k.
• Finally, I leave the part I need as is, and simplify the rest: (k² - k) + 2k.

(8) For the seventh step, we simply replaced k² - k by 2b. We established that these two quantities
were equal in the second sentence of the inductive step.


(9) For the last step, we used left distributivity to write 2b + 2k as 2(b + k). 


Sometimes a statement involving the natural numbers may be false for 0, but true from some natural 
number on. In this case, we can still use induction. We just need to adjust the base case. 


Example 4.6: Let’s use the Principle of Mathematical Induction to prove that n² > 2n + 1 for all natural
numbers n ≥ 3.


Base Case (k = 3): 3² = 9 and 2 · 3 + 1 = 6 + 1 = 7. So, 3² > 2 · 3 + 1.
Inductive Step: Let k ∈ N with k ≥ 3 and assume that k² > 2k + 1. Then we have
(k + 1)² = (k + 1)(k + 1) = (k + 1)k + (k + 1)(1) = k² + k + k + 1 > (2k + 1) + k + k + 1
= 2k + 2 + k + k = 2(k + 1) + k + k ≥ 2(k + 1) + 1 (because k + k ≥ 3 + 3 = 6 ≥ 1).


By the Principle of Mathematical Induction, n² > 2n + 1 for all n ∈ N with n ≥ 3. □
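
A quick computation shows why the base case in Example 4.6 starts at 3. In this sketch the names claim, failures_below_3, and holds_from_3 are hypothetical, and the upper bound 200 is an arbitrary cutoff chosen for illustration.

```python
def claim(n):
    """The statement P(n) of Example 4.6: n^2 > 2n + 1."""
    return n ** 2 > 2 * n + 1

# The statement is false for each natural number below 3 ...
failures_below_3 = [n for n in range(3) if not claim(n)]

# ... and holds for every tested natural number from 3 on.
holds_from_3 = all(claim(n) for n in range(3, 200))
```

Here failures_below_3 is [0, 1, 2] and holds_from_3 is True, which is consistent with adjusting the base case to k = 3.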


Notes: (1) If we have a sequence of equations and inequalities of the form =, ≥, and > (with at least
one inequality symbol appearing), beginning with a and ending with b, then the final result is a > b if
> appears at least once and a ≥ b otherwise.


For example, if a = j ≥ h = m > n = p = q ≥ b, then a > b. The sequence that appears in the
solution above has this form.


(k + 1)² = (k + 1)(k + 1) = (k + 1)k + (k + 1)(1) = k² + k + k + 1 > (2k + 1) + k + k + 1
= 2k + 2 + k + k = 2(k + 1) + k + k ≥ 2(k + 1) + 1


(2) By definition, x² = x · x. We used this in the first equality in the inductive step to write (k + 1)² as
(k + 1)(k + 1).


(3) For the second equality in the inductive step, we used left distributivity to write (k + 1)(k + 1) as
(k + 1)k + (k + 1)(1). If you have trouble seeing this, you can make a substitution like we did in Note
2 following Example 4.5.


(4) For the third equality in the inductive step, we used right distributivity to write (k + 1)k as
k · k + 1 · k = k² + k. We also used the multiplicative identity property to write (k + 1)(1) = k + 1.


(5) Associativity of addition is being used when we write the expression k² + k + k + 1. Notice the
lack of parentheses. Technically speaking, we should have written (k² + k) + (k + 1) and then taken
another step to rewrite this as k² + (k + (k + 1)). However, since we have associativity, we can
simply drop all those parentheses.


(6) The inequality “k² + k + k + 1 > (2k + 1) + k + k + 1” was attained by using the inductive
hypothesis “k² > 2k + 1.”

(7) The dedicated reader should verify that the remaining equalities in the proof are valid by
determining which ring properties were used at each step.


Example 4.7: Let’s use the Principle of Mathematical Induction to prove that for every natural number
n, there is a natural number j such that n = 2j or n = 2j + 1.

Base Case (k = 0): 0 = 2 · 0


Inductive Step: Suppose that k ∈ N and there is j ∈ N such that k = 2j or k = 2j + 1. If k = 2j, then
k + 1 = 2j + 1. If k = 2j + 1, then k + 1 = (2j + 1) + 1 = 2j + (1 + 1) = 2j + 2 = 2(j + 1). Here
we used the fact that (N, +, ·) is a semiring (more specifically, we used associativity of addition in N
and distributivity of multiplication over addition in N). Since N is closed under addition, j + 1 ∈ N.


By the Principle of Mathematical Induction, for every natural number n, there is a natural number j
such that n = 2j or n = 2j + 1. □


Notes: (1) We can now prove the analogous result for the integers: “For every integer n, there is an
integer j such that n = 2j or n = 2j + 1.”


We already proved the result for n ≥ 0. If n < 0, then -n > 0, and so there is a natural number j such
that -n = 2j or -n = 2j + 1. If -n = 2j, then n = 2(-j) (and since j ∈ N, -j ∈ Z). If -n = 2j + 1,
then n = -(2j + 1) = -2j - 1 = -2j - 1 - 1 + 1 (SACT) = -2j - 2 + 1 = 2(-j - 1) + 1. Here we
used the fact that (Z, +, ·) is a ring. Since Z is closed under addition, -j - 1 = -j + (-1) ∈ Z.


(2) If there is an integer j such that n = 2j, we say that n is even. If there is an integer j such that
n = 2j + 1, we say that n is odd.


(3) An integer n cannot be both even and odd. Indeed, if n = 2j and n = 2k + 1, then 2j = 2k + 1.
So, we have
2(j - k) = 2j - 2k = (2k + 1) - 2k = 2k + (1 - 2k) = 2k + (-2k + 1)
= (2k - 2k) + 1 = 0 + 1 = 1.


So, 2(j - k) = 1. But 2 does not have a multiplicative inverse in Z, and so, this is a contradiction.
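
The argument of Example 4.7, together with Note 1, amounts to a procedure for writing any integer n as 2j or 2j + 1. The sketch below follows the induction step by step instead of using built-in division. The name parity_decomposition is hypothetical, and the return value (j, r) encodes n = 2j + r with r equal to 0 or 1.

```python
def parity_decomposition(n):
    """Return (j, r) with n == 2*j + r and r in {0, 1}, following Example 4.7 and Note 1."""
    if n < 0:
        j, r = parity_decomposition(-n)            # -n = 2j or -n = 2j + 1, with j in N
        return (-j, 0) if r == 0 else (-j - 1, 1)  # n = 2(-j) or n = 2(-j - 1) + 1
    j, r = 0, 0                                    # base case: 0 = 2 * 0
    for _ in range(n):                             # apply the inductive step n times
        # k = 2j gives k + 1 = 2j + 1; k = 2j + 1 gives k + 1 = 2(j + 1)
        j, r = (j, 1) if r == 0 else (j + 1, 0)
    return j, r
```

For example, parity_decomposition(7) returns (3, 1) because 7 = 2 · 3 + 1, and parity_decomposition(-7) returns (-4, 1) because -7 = 2 · (-4) + 1.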
Theorem 4.4: The product of two odd integers is odd. 


Proof: Let m and n be odd integers. Then there are integers j and k such that m = 2j + 1 and
n = 2k + 1. So,
m · n = (2j + 1) · (2k + 1) = (2j + 1)(2k) + (2j + 1)(1) = (2k)(2j + 1) + (2j + 1)
= ((2k)(2j) + 2k) + (2j + 1) = (2(k(2j)) + 2k) + (2j + 1) = 2(k(2j) + k) + (2j + 1)
= (2(k(2j) + k) + 2j) + 1 = 2((k(2j) + k) + j) + 1.


Here we used the fact that (Z, +, ·) is a ring. (Which properties did we use?) Since Z is closed under
addition and multiplication, we have (k(2j) + k) + j ∈ Z. Therefore, mn is odd. □
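
The final expression in the proof of Theorem 4.4 gives an explicit witness for the oddness of the product. The following sketch evaluates it; odd_product_witness is a hypothetical name for the integer (k(2j) + k) + j from the proof.

```python
def odd_product_witness(j, k):
    """For m = 2j + 1 and n = 2k + 1, return the integer w from the proof with m*n == 2w + 1."""
    return (k * (2 * j) + k) + j
```

For j = 1 and k = 2, we have m = 3, n = 5, and odd_product_witness(1, 2) returns 7, so m · n = 15 = 2 · 7 + 1.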

Problem Set 4


Full solutions to these problems are available for free download here: 


www.SATPrepGet800.com/PMFBXSG 


LEVEL 1 


1. The addition and multiplication tables below are defined on the set S = {0, 1}. Show that
(S, +, ·) does not define a ring.


 + | 0  1          · | 0  1
---+------        ---+------
 0 | 0  1          0 | 1  0
 1 | 1  0          1 | 0  1


2. Let S = {0, 1} and define addition (+) and multiplication (·) so that (S, +, ·) is a ring. Assume
that 0 is the additive identity in S and 1 is the multiplicative identity in S. Draw the tables for
addition and multiplication and verify that with these tables, (S, +, ·) is a ring.


LEVEL 2 


3. Use the Principle of Mathematical Induction to prove the following:


(i) 2ⁿ > n for all natural numbers n ≥ 1.


(ii) 0 + 1 + 2 + ⋯ + n = n(n + 1)/2 for all natural numbers n.


(iii) n! > 2ⁿ for all natural numbers n > 4 (where n! = 1 · 2 ⋯ n for all natural numbers
n ≥ 1).


(iv) 2ⁿ > n² for all natural numbers n > 4.


4. Show that the sum of three integers that are divisible by 5 is divisible by 5. 


LEVEL 3 


5. Prove that if a, b, c ∈ Z with a|b and b|c, then a|c.


6. Prove that n³ - n is divisible by 3 for all natural numbers n.


LEVEL 4 


7. Prove that if a, b, c, d, e ∈ Z with a|b and a|c, then a|(db + ec).


8. Prove that 3ⁿ - 1 is even for all natural numbers n.

9. Show that Theorem 4.3 (the Principle of Mathematical Induction) is equivalent to the following
statement:


(⋆) Let P(n) be a statement and suppose that (i) P(0) is true and (ii) for all k ∈ N,
P(k) → P(k + 1). Then P(n) is true for all n ∈ N.


LEVEL 5 


10. The Principle of Strong Induction is the following statement:


(⋆⋆) Let P(n) be a statement and suppose that (i) P(0) is true and (ii) for all k ∈ N,
∀j ≤ k (P(j)) → P(k + 1). Then P(n) is true for all n ∈ N.


Use the Principle of Mathematical Induction to prove the Principle of Strong Induction. 


11. Show that (Q, +, ·) is a field.


12. Use the Principle of Mathematical Induction to prove that for every n ∈ N, if S is a set with
|S| = n, then S has 2ⁿ subsets. (Hint: Use Problem 14 from Lesson 2.)

LESSON 5 - REAL ANALYSIS
THE COMPLETE ORDERED FIELD OF REALS


Fields 

Let’s review the number systems we have discussed so far. 
The set N = {0, 1, 2, 3, ...} is the set of natural numbers and the structure (N, +, ·) is a semiring.
The set Z = {..., -3, -2, -1, 0, 1, 2, 3, ...} is the set of integers and the structure (Z, +, ·) is a ring.
The set Q = {a/b | a ∈ Z, b ∈ Z*} is the set of rational numbers and the structure (Q, +, ·) is a field.
And now let’s formally introduce the notion of a field (and we will review the definitions of ring and
semiring in the notes below).
A field is a triple (F, +, ·), where F is a set and + and · are binary operations on F satisfying

(1) (F, +) is a commutative group.

(2) (F*, ·) is a commutative group.

(3) · is distributive over + in F. That is, for all x, y, z ∈ F, we have

x · (y + z) = x · y + x · z and (y + z) · x = y · x + z · x.

(4) 0 ≠ 1.
We will refer to the operation + as addition, the operation · as multiplication, the additive identity as
0, the multiplicative identity as 1, the additive inverse of an element x ∈ F as -x, and the multiplicative
inverse of an element x ∈ F* as x⁻¹. We will often abbreviate x · y as xy.
Notes: (1) Recall from Lesson 3 that (F, +) being a commutative group means the following:

• (Closure) For all x, y ∈ F, x + y ∈ F.

• (Associativity) For all x, y, z ∈ F, (x + y) + z = x + (y + z).

• (Commutativity) For all x, y ∈ F, x + y = y + x.

• (Identity) There exists an element 0 ∈ F such that for all x ∈ F, 0 + x = x + 0 = x.

• (Inverse) For each x ∈ F, there is -x ∈ F such that x + (-x) = (-x) + x = 0.


(2) Similarly, (F*, ·) being a commutative group means the following:
• (Closure) For all x, y ∈ F*, xy ∈ F*.
• (Associativity) For all x, y, z ∈ F*, (xy)z = x(yz).
• (Commutativity) For all x, y ∈ F*, xy = yx.
• (Identity) There exists an element 1 ∈ F* such that for all x ∈ F*, 1x = x · 1 = x.
• (Inverse) For each x ∈ F*, there is x⁻¹ ∈ F* such that xx⁻¹ = x⁻¹x = 1.

(3) Recall that F* is the set of nonzero elements of F. We can write F* = {x ∈ F | x ≠ 0} (pronounced
“the set of x in F such that x is not equal to 0”) or F* = F \ {0} (pronounced “F with 0 removed”).


(4) The properties that define a field are called the field axioms. These are the statements that are 
given to be true in all fields. There are many other statements that are true in fields. However, any 
additional statements need to be proved using the axioms. 


(5) If we replace the condition that “(F*, ·) is a commutative group” by “(F, ·) is a monoid,” then the
resulting structure is called a ring. The most well-known example of a ring is Z, the ring of integers. See
Lesson 4 for details about Z and rings in general.


We also do not require 0 and 1 to be distinct in the definition of a ring. If 0 = 1, we get the zero ring,
which consists of just one element, namely 0 (Why?). The operations of addition and multiplication are
defined by 0 + 0 = 0 and 0 · 0 = 0. The reader may want to verify that the zero ring is in fact a ring.


The main difference between a ring and a field is that in a ring, there can be nonzero elements that do 
not have multiplicative inverses. For example, in Z, 2 has no multiplicative inverse. So, the equation 
2x = 1 has no solution. 


(6) If we also replace “(F, +) is a commutative group” by “(F, +) is a commutative monoid,” then the 
resulting structure is a semiring. The most well-known example of a semiring is N, the semiring of 
natural numbers. 


The main difference between a semiring and a ring is that in a semiring, there can be elements that do 
not have additive inverses. For example, in N, 1 has no additive inverse. Thus, the equation x + 1 = 0 
has no solution. 


Technical note: For a semiring, we include one additional axiom: For all x ∈ F, 0 · x = x · 0 = 0.


(7) Every field is a commutative ring. Although this is not too hard to show (you will be asked to show
this in Problem 6 below), it is worth observing that this is not completely obvious. For example, if
(F, +, ·) is a ring, then since (F, ·) is a monoid with identity 1, it follows that 1 · 0 = 0 · 1 = 0.
However, in the definition of a field given above, this property of 0 is not given as an axiom. We are
given that (F*, ·) is a commutative group, and so, it follows that 1 is an identity for F*. But 0 ∉ F*, and
so, 1 · 0 = 0 · 1 = 0 needs to be proved.


Similarly, in the definition of a field given above, 0 is excluded from associativity and commutativity. 
These need to be checked. 


(8) You were asked to verify that (Q, +, ·) is a field in Problems 9 and 11 from Lesson 3 and Problem
11 from Lesson 4.
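
Note 5 above invites the reader to verify that the zero ring is a ring. For a finite set, the ring axioms of Lesson 4 can be checked by brute force. The following is a minimal sketch under stated assumptions: the function is_ring and its parameters (R, add, mul, zero, one) are hypothetical names, and the approach works only when R is finite.

```python
from itertools import product

def is_ring(R, add, mul, zero, one):
    """Check the ring axioms of Lesson 4 by brute force on a finite set R."""
    for x, y in product(R, repeat=2):
        if add(x, y) not in R or mul(x, y) not in R:        # closure
            return False
        if add(x, y) != add(y, x):                           # commutativity of addition
            return False
    for x, y, z in product(R, repeat=3):
        if add(add(x, y), z) != add(x, add(y, z)):           # associativity of addition
            return False
        if mul(mul(x, y), z) != mul(x, mul(y, z)):           # associativity of multiplication
            return False
        if mul(x, add(y, z)) != add(mul(x, y), mul(x, z)):   # left distributivity
            return False
        if mul(add(y, z), x) != add(mul(y, x), mul(z, x)):   # right distributivity
            return False
    for x in R:
        if add(zero, x) != x or add(x, zero) != x:           # additive identity
            return False
        if mul(one, x) != x or mul(x, one) != x:             # multiplicative identity
            return False
        if not any(add(x, y) == zero for y in R):            # additive inverse
            return False
    return True

# The zero ring: one element, with 0 + 0 = 0 and 0 * 0 = 0, and 0 = 1.
zero_ring = {0}
zero_ring_ok = is_ring(zero_ring, lambda x, y: 0, lambda x, y: 0, 0, 0)
```

Here zero_ring_ok is True. By contrast, the set {0, 1} with ordinary integer addition and multiplication fails the check, because 1 + 1 = 2 is not in the set, so addition is not a binary operation on it.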


Subtraction and Division: If a, b ∈ F, we define a - b = a + (-b) and for b ≠ 0, a/b = ab⁻¹.
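
In the field Q, these two definitions can be tried out with Python's standard fractions module. The function names subtract and divide are hypothetical and serve only to spell out the definitions a - b = a + (-b) and a/b = ab⁻¹.

```python
from fractions import Fraction

def subtract(a, b):
    """a - b is defined as a + (-b)."""
    return a + (-b)

def divide(a, b):
    """a / b is defined as a times the multiplicative inverse of b, for b != 0."""
    if b == 0:
        raise ZeroDivisionError("0 has no multiplicative inverse")
    b_inverse = 1 / b          # for a Fraction b, this is the exact inverse of b
    return a * b_inverse
```

For example, subtract(Fraction(1, 2), Fraction(1, 3)) is 1/6, and divide(Fraction(1, 2), Fraction(3, 4)) is 2/3.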
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Ordered Rings and Fields 


We say that a ring $(R, +, \cdot)$ is ordered if there is a nonempty subset $P$ of $R$, called the set of positive
elements of $R$, satisfying the following three properties:


(1) If $a, b \in P$, then $a + b \in P$.
(2) If $a, b \in P$, then $ab \in P$.
(3) If $a \in R$, then exactly one of the following holds: $a \in P$, $a = 0$, or $-a \in P$.


Note: If $a \in P$, we say that $a$ is positive and if $-a \in P$, we say that $a$ is negative.
Also, we define $R^+ = P$ and $R^- = \{a \in R \mid -a \in P\}$.


Example 5.1: Let $R = \mathbb{Z}$ and let $P_{\mathbb{Z}} = \{1, 2, 3, \ldots\}$. It’s easy to see that properties (1), (2), and (3) are
satisfied. It follows that $(\mathbb{Z}, +, \cdot)$ is an ordered ring.
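

The three properties can be spot-checked on a finite sample of integers. The sketch below is only an illustration, not a proof: the helper names `in_P` and `check_ordered_ring_axioms` are hypothetical, and the set of positive integers is modeled as a membership test rather than as a stored infinite set.

```python
def in_P(a: int) -> bool:
    """Hypothetical membership test for the positive integers {1, 2, 3, ...}."""
    return a >= 1


def check_ordered_ring_axioms(sample) -> bool:
    """Check properties (1), (2), (3) for every element and pair drawn from sample."""
    for a in sample:
        # Property (3): exactly one of "a is positive", "a = 0", "-a is positive" holds.
        if [in_P(a), a == 0, in_P(-a)].count(True) != 1:
            return False
        for b in sample:
            if in_P(a) and in_P(b):
                # Properties (1) and (2): sums and products of positives are positive.
                if not (in_P(a + b) and in_P(a * b)):
                    return False
    return True


print(check_ordered_ring_axioms(range(-20, 21)))
```

A finite check like this can only fail to find a counterexample; the general statement still rests on the properties of the integers established earlier.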


Theorem 5.1: $(\mathbb{Q}, +, \cdot)$ is an ordered field.


Note: The proof of this result is a bit technical, but I am including it for completeness. The student just
starting out in pure mathematics can feel free to just accept this result and skip the proof.


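Recall: (1) Rational numbers have the form $\frac{m}{n}$ where $m$ and $n$ are integers and $n \neq 0$.


(2) Two rational numbers $\frac{m}{n}$ and $\frac{p}{q}$ are equal if and only if $mq = np$.


(3) For rational numbers $\frac{m}{n}$ and $\frac{p}{q}$, we define addition and multiplication by $\frac{m}{n} + \frac{p}{q} = \frac{mq + np}{nq}$ and $\frac{m}{n} \cdot \frac{p}{q} = \frac{mp}{nq}$.


(4) The additive inverse of $\frac{m}{n}$ is $-\frac{m}{n} = \frac{-m}{n}$.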
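

The rules recalled above can be written out directly on numerator and denominator pairs. This is a minimal sketch; the function names `add`, `mul`, and `equal` are hypothetical, and Python's standard `fractions.Fraction` is used only to cross-check the results.

```python
from fractions import Fraction


def add(m, n, p, q):
    """m/n + p/q = (mq + np)/(nq), returned as a (numerator, denominator) pair."""
    return (m * q + n * p, n * q)


def mul(m, n, p, q):
    """(m/n)(p/q) = (mp)/(nq), returned as a (numerator, denominator) pair."""
    return (m * p, n * q)


def equal(m, n, p, q):
    """m/n = p/q if and only if mq = np."""
    return m * q == n * p


s = add(1, 2, 1, 3)   # 1/2 + 1/3
t = mul(2, 3, 3, 4)   # (2/3)(3/4), left unreduced on purpose

# Cross-check against the standard library's rational arithmetic.
print(Fraction(*s) == Fraction(1, 2) + Fraction(1, 3))
print(equal(t[0], t[1], 1, 2))
```

Note that `mul(2, 3, 3, 4)` produces the unreduced pair (6, 12); the equality rule in (2) is what identifies it with 1/2.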


Analysis: Before writing out the proof in detail, let’s think about how we would go about it. First of all,
we already know from Problem 11 in Lesson 4 that $(\mathbb{Q}, +, \cdot)$ is a field. So, we need only show that it is
ordered. To do this, we need to come up with a set $P$ of positive elements from $\mathbb{Q}$. The natural choice
would be to take the set of quotients whose numerator (number on the top) and denominator (number
on the bottom) are both positive integers. In other words, we will let $P_{\mathbb{Q}}$ be the set of all the rational
numbers of the form $\frac{m}{n}$ where $m$ and $n$ are both elements of $P_{\mathbb{Z}}$ (as defined in Example 5.1 above).
Since $\frac{-m}{-n} = \frac{m}{n}$ (because $(-m)n = (-n)m$), we must automatically be including all quotients whose
numerator and denominator are both negative integers as well.


With this definition of $P_{\mathbb{Q}}$, it is straightforward to verify properties (1) and (2) of an ordered field.


To verify property (3), we need to check three things.


(i) For any rational number $a$, $a$ is positive, zero, or negative ($a \in P_{\mathbb{Q}}$, $a = 0$, or $-a \in P_{\mathbb{Q}}$). We will
show this by assuming $a \notin P_{\mathbb{Q}}$ and $a \neq 0$, and then proving that we must have $-a \in P_{\mathbb{Q}}$.


(ii) For any rational number $a$, $a$ cannot be both positive and negative. We will show this by
assuming $a \in P_{\mathbb{Q}}$ and $-a \in P_{\mathbb{Q}}$, and then deriving a contradiction.


(iii) A positive or negative rational number is not zero, and a rational number that is zero is not
positive or negative. This is straightforward to check.


Let’s write out the details.


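Proof of Theorem 5.1: By Problem 11 from Lesson 4, $(\mathbb{Q}, +, \cdot)$ is a field.


Let $F = \mathbb{Q}$ and let $P_{\mathbb{Q}} = \{x \in \mathbb{Q} \mid x = \frac{m}{n} \text{ with } m, n \in P_{\mathbb{Z}}\}$. Let $a, b \in P_{\mathbb{Q}}$. Then there are $m, n, p, q \in P_{\mathbb{Z}}$
with $a = \frac{m}{n}$ and $b = \frac{p}{q}$. We have $a + b = \frac{m}{n} + \frac{p}{q} = \frac{mq + np}{nq}$. Since $P_{\mathbb{Z}}$ satisfies (2) above, we have
$mq, np, nq \in P_{\mathbb{Z}}$. Since $P_{\mathbb{Z}}$ satisfies (1) above, we have $mq + np \in P_{\mathbb{Z}}$. Therefore, $a + b \in P_{\mathbb{Q}}$ and (1)
holds. Also, we have $ab = \frac{m}{n} \cdot \frac{p}{q} = \frac{mp}{nq}$. Since $P_{\mathbb{Z}}$ satisfies (2) above, we have $mp, nq \in P_{\mathbb{Z}}$, and therefore,
$ab \in P_{\mathbb{Q}}$ and (2) holds.


Now, suppose $a \notin P_{\mathbb{Q}}$ and $a \neq 0$. Since $a \in \mathbb{Q}$, there are $m \in \mathbb{Z}$ and $n \in \mathbb{Z}^*$ such that $a = \frac{m}{n}$. But
$a \neq 0$, and so, we must have $m \in \mathbb{Z}^*$. Since $a \notin P_{\mathbb{Q}}$, either $m \notin P_{\mathbb{Z}}$ or $n \notin P_{\mathbb{Z}}$ (or both). If both $m \notin P_{\mathbb{Z}}$
and $n \notin P_{\mathbb{Z}}$, then we have $a = \frac{m}{n} = \frac{-m}{-n}$ (because $m(-n) = n(-m)$). Then $-m, -n \in P_{\mathbb{Z}}$, and so, $a \in P_{\mathbb{Q}}$,
contrary to our assumption that $a \notin P_{\mathbb{Q}}$. If $m \notin P_{\mathbb{Z}}$ and $n \in P_{\mathbb{Z}}$, then $-m \in P_{\mathbb{Z}}$, and therefore,
$-a = \frac{-m}{n} \in P_{\mathbb{Q}}$. If $m \in P_{\mathbb{Z}}$ and $n \notin P_{\mathbb{Z}}$, then $-n \in P_{\mathbb{Z}}$, and therefore, $-a = \frac{-m}{n} = \frac{m}{-n} \in P_{\mathbb{Q}}$. So, at least one
of $a \in P_{\mathbb{Q}}$, $a = 0$, or $-a \in P_{\mathbb{Q}}$ holds.


If $a \in P_{\mathbb{Q}}$ and $-a \in P_{\mathbb{Q}}$, then $a = \frac{m}{n}$ and $-a = \frac{p}{q}$ with $m, n, p, q \in P_{\mathbb{Z}}$. We can also write $-a$ as
$-a = \frac{-m}{n}$. So, $\frac{-m}{n} = \frac{p}{q}$, and thus, $(-m)q = np$. Since $n, p \in P_{\mathbb{Z}}$, we have $np \in P_{\mathbb{Z}}$. Since $(-m)q = np$, we
must have $(-m)q \in P_{\mathbb{Z}}$. But $-m \notin P_{\mathbb{Z}}$, and so, $-(-m) \in P_{\mathbb{Z}}$. Since we also have $q \in P_{\mathbb{Z}}$, we must have
$-(-m)q \in P_{\mathbb{Z}}$. But then by (3) for $P_{\mathbb{Z}}$, $(-m)q \notin P_{\mathbb{Z}}$. This contradiction shows that we cannot have both
$a \in P_{\mathbb{Q}}$ and $-a \in P_{\mathbb{Q}}$.


If $a \in P_{\mathbb{Q}}$, then $a = \frac{m}{n}$ with $m, n \in P_{\mathbb{Z}}$. So, $m \neq 0$, and therefore, $a \neq 0$. If $-a \in P_{\mathbb{Q}}$, then $-a = \frac{m}{n}$ with
$m, n \in P_{\mathbb{Z}}$. If $a = 0$, then $-a = 0$, and so, $m = 0$. But $m \in P_{\mathbb{Z}}$, and so, $m \neq 0$. Thus, $a \neq 0$.


If $a = 0$, then we have $a = \frac{0}{1} \notin P_{\mathbb{Q}}$ and $-a = \frac{-0}{1} = \frac{0}{1} \notin P_{\mathbb{Q}}$.


It follows that $(\mathbb{Q}, +, \cdot)$ is an ordered field. □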


If $(R, +, \cdot)$ is an ordered ring and $P$ is the set of positive elements from the ring, we will write $a > 0$
instead of $a \in P$ and $a < 0$ instead of $-a \in P$. If $a - b > 0$, we will write $a > b$ or $b < a$.


We write $a \geq 0$ if $a \in P$ or $a = 0$, we write $a \leq 0$ if $-a \in P$ or $a = 0$, and we write $a \geq b$ or $b \leq a$ if
$a - b \geq 0$.


We may use the notation $(R, \leq)$ for an ordered ring, where $\leq$ is the relation defined in the last
paragraph. Note that $+$ and $\cdot$ aren’t explicitly mentioned, but of course they are still part of the ring.
In the future, we may just use the name of the set for the whole structure when there is no danger of
confusion. For example, we may refer to the ring $R$ or the ordered field $F$ instead of the ring $(R, +, \cdot)$
or the ordered field $(F, \leq)$.


Fields are particularly nice to work with because all the arithmetic and algebra we’ve learned through
the years can be used in fields. For example, in the field of rational numbers, we can solve the equation
$2x = 1$. The multiplicative inverse property allows us to do this. Indeed, the multiplicative inverse of 2
is $\frac{1}{2}$, and therefore, $x = \frac{1}{2}$ is a solution to the given equation. Compare this to the ring of integers. If we
restrict ourselves to the integers, then the equation $2x = 1$ has no solution.
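

The contrast between the field of rational numbers and the ring of integers can be seen in a few lines. This is an illustrative sketch using Python's standard `fractions.Fraction`; the variable names are hypothetical, and the integer search covers only a finite range, so it illustrates the claim rather than proving it.

```python
from fractions import Fraction

# In the rationals, multiply both sides of 2x = 1 by the multiplicative inverse of 2.
inverse_of_two = 1 / Fraction(2)
x = inverse_of_two * 1
solves_in_Q = (2 * x == 1)

# In the integers, a search over a finite range (illustrative only) finds no solution.
integer_solutions = [k for k in range(-1000, 1001) if 2 * k == 1]

print(x, solves_in_Q, integer_solutions)
```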


Working with ordered fields is very nice as well. In the problem set below, you will be asked to derive 
some additional properties of fields and ordered fields that follow from the axioms. We will prove a 
few of these properties now as examples. 


Theorem 5.2: Let $(F, \leq)$ be an ordered field. Then for all $x \in F^*$, $x \cdot x > 0$.


Proof: There are two cases to consider: (i) If $x > 0$, then $x \cdot x > 0$ by property (2) of an ordered field.
(ii) If $x < 0$, then $-x > 0$, and so, $(-x)(-x) > 0$, again by property (2) of an ordered field. Now, using
Problem 3 (parts (vi) and (vii)) in the problem set below, together with commutativity and associativity
of multiplication, and the multiplicative identity property, we have


$(-x)(-x) = (-1x)(-1x) = (-1)(-1)x \cdot x = 1(x \cdot x) = x \cdot x$.


So, again we have $x \cdot x > 0$. □


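Theorem 5.3: Every ordered field $(F, \leq)$ contains a copy of the natural numbers. Specifically, $F$ contains
a subset $\overline{\mathbb{N}} = \{\overline{n} \mid n \in \mathbb{N}\}$ such that for all $n, m \in \mathbb{N}$, we have $\overline{n + m} = \overline{n} + \overline{m}$, $\overline{n \cdot m} = \overline{n} \cdot \overline{m}$, and
$n < m \leftrightarrow \overline{n} < \overline{m}$.


Proof: Let $(F, \leq)$ be an ordered field. By the definition of a field, $0, 1 \in F$ and $0 \neq 1$.


We let $\overline{0} = 0$ and $\overline{n} = 1 + 1 + \cdots + 1$, where 1 appears $n$ times. Let $\overline{\mathbb{N}} = \{\overline{n} \mid n \in \mathbb{N}\}$. Then $\overline{\mathbb{N}} \subseteq F$.


We first prove by induction on $m$ that for all $n, m \in \mathbb{N}$, $\overline{n + m} = \overline{n} + \overline{m}$.


Base case ($k = 0$): $\overline{n + 0} = \overline{n} = \overline{n} + 0 = \overline{n} + \overline{0}$.


Inductive step: Suppose that $\overline{n + k} = \overline{n} + \overline{k}$. Then we have


$\overline{n + (k + 1)} = \overline{(n + k) + 1} = \overline{n + k} + 1 = (\overline{n} + \overline{k}) + 1 = \overline{n} + (\overline{k} + 1) = \overline{n} + \overline{k + 1}$.


By the Principle of Mathematical Induction, for all natural numbers $m$, $\overline{n + m} = \overline{n} + \overline{m}$.


Similarly, we prove by induction on $m$ that for all $n, m \in \mathbb{N}$, $\overline{n \cdot m} = \overline{n} \cdot \overline{m}$.


Base case ($k = 0$): $\overline{n \cdot 0} = \overline{0} = \overline{n} \cdot \overline{0}$.


Inductive step: Suppose that $\overline{n \cdot k} = \overline{n} \cdot \overline{k}$. Then we have


$\overline{n \cdot (k + 1)} = \overline{nk + n} = \overline{nk} + \overline{n} = \overline{n} \cdot \overline{k} + \overline{n} = \overline{n}(\overline{k} + 1) = \overline{n}(\overline{k} + \overline{1}) = \overline{n} \cdot \overline{k + 1}$.


By the Principle of Mathematical Induction, for all natural numbers $m$, $\overline{n \cdot m} = \overline{n} \cdot \overline{m}$.


We now wish to prove that for all $n, m \in \mathbb{N}$, $n < m \leftrightarrow \overline{n} < \overline{m}$.


We first note that for all $n \in \mathbb{N}$, $\overline{n + 1} > \overline{n}$ because $\overline{n + 1} - \overline{n} = \overline{n} + 1 - \overline{n} = 1 = 1 \cdot 1 > 0$ by
Theorem 5.2.


We now prove by induction on $n$ that for all $n \in \mathbb{N}$ with $n > 0$, we have $\overline{n} > 0$.


Base case ($k = 1$): $\overline{1} = 1 = 1 \cdot 1 > 0$ by Theorem 5.2.


Inductive step: Assume that $\overline{k} > 0$. Then $\overline{k + 1} = \overline{k} + \overline{1} = \overline{k} + 1 > 0$. Here we have used Order
Property 1 together with $\overline{k} > 0$ and $1 > 0$.


By the Principle of Mathematical Induction, for all natural numbers $n$ with $n > 0$, we have $\overline{n} > 0$.


Conversely, if $\overline{n} > 0$, then $n \neq 0$ (because $\overline{0} = 0$). Since $\overline{n}$ is defined only for $n \geq 0$, we have $n > 0$.


So, we have shown that for $n \in \mathbb{N}$, $n > 0$ if and only if $\overline{n} > 0$.


Next, note that if $n < m$, then $\overline{m} = \overline{(m - n) + n} = \overline{m - n} + \overline{n}$. It follows that $\overline{m - n} = \overline{m} - \overline{n}$.


Finally, we have $n < m \leftrightarrow m - n > 0 \leftrightarrow \overline{m - n} > 0 \leftrightarrow \overline{m} - \overline{n} > 0 \leftrightarrow \overline{m} > \overline{n} \leftrightarrow \overline{n} < \overline{m}$. □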
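

The construction in the proof can be tried out in a concrete ordered field. The sketch below takes the field to be the rational numbers, modeled by Python's `fractions.Fraction` (an assumption made for illustration), and builds the copy of each natural number by repeatedly adding the field's 1. The names `bar` and `check_copy` are hypothetical, and the check covers only finitely many pairs.

```python
from fractions import Fraction

ZERO = Fraction(0)   # additive identity of the field
ONE = Fraction(1)    # multiplicative identity of the field


def bar(n: int) -> Fraction:
    """Return 1 + 1 + ... + 1 (n summands), computed inside the field."""
    total = ZERO
    for _ in range(n):
        total = total + ONE
    return total


def check_copy(limit: int) -> bool:
    """Check the three preservation properties for all n, m below limit."""
    for n in range(limit):
        for m in range(limit):
            if bar(n + m) != bar(n) + bar(m):
                return False
            if bar(n * m) != bar(n) * bar(m):
                return False
            if (n < m) != (bar(n) < bar(m)):
                return False
    return True


print(check_copy(12))
```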


Notes: (1) The function that sends $n \in \mathbb{N}$ to $\overline{n} \in \overline{\mathbb{N}}$ is called an isomorphism. It has the following
properties: (i) $\overline{n + m} = \overline{n} + \overline{m}$, (ii) $\overline{n \cdot m} = \overline{n} \cdot \overline{m}$, and (iii) $n < m$ if and only if $\overline{n} < \overline{m}$. The function
gives a one-to-one correspondence between the elements of $\mathbb{N}$ and the elements of $\overline{\mathbb{N}}$.


So, when we say that every ordered field contains a “copy” of the natural numbers, we mean that there is a
subset $\overline{\mathbb{N}}$ of the field so that $(\overline{\mathbb{N}}, \leq)$ is isomorphic to $(\mathbb{N}, \leq)$ (note that addition and multiplication are
preserved as well, even though they’re not explicitly mentioned in the notation).


(2) We will formally introduce isomorphisms in Lesson 11. 


Theorem 5.4: Let $(F, \leq)$ be an ordered field and let $x \in F$ with $x > 0$. Then $\frac{1}{x} > 0$.


Proof: Since $x \neq 0$, $\frac{1}{x} = x^{-1}$ exists and is nonzero.


Assume toward contradiction that $\frac{1}{x} < 0$. Then $-\frac{1}{x} > 0$. Using Problem 3 (part (vi)) from the problem
set below, together with commutativity and associativity of multiplication, the multiplicative inverse
property, and the multiplicative identity property, $x\left(-\frac{1}{x}\right) = x(-1)x^{-1} = -1 \cdot x \cdot x^{-1} = -1 \cdot 1 = -1$.


Since $x > 0$ and $-\frac{1}{x} > 0$, we have $-1 = x\left(-\frac{1}{x}\right) > 0$. So, $1 < 0$. But by Theorem 5.2, $1 = 1 \cdot 1 > 0$. This
is a contradiction. Therefore, $\frac{1}{x} > 0$. □


Why Isn’t Q Enough? 


At first glance, it would appear that the ordered field of rational numbers would be sufficient to solve 
all “real world” problems. However, a long time ago, a group of people called the Pythagoreans showed 
that this was not the case. The problem was first discovered when applying the now well-known 
Pythagorean Theorem. 


Theorem 5.5 (Pythagorean Theorem): In a right triangle with legs of lengths $a$ and $b$, and a hypotenuse
of length $c$, $c^2 = a^2 + b^2$.


The picture to the right shows a right triangle. The vertical and 
horizontal segments (labeled a and b, respectively) are called the legs 
of the right triangle, and the side opposite the right angle (labeled c) is 
called the hypotenuse of the right triangle. 


There are many ways to prove the Pythagorean Theorem. Here, we will
provide a simple geometric argument. For the proof we will want to
recall that the area of a square with side length $s$ is $A = s^2$, and the area of a triangle with base $b$ and
height $h$ is $A = \frac{1}{2}bh$. Notice that in our right triangle drawn here, the base is labeled $b$ (how
convenient), and the height is labeled $a$. So, the area of this right triangle is $A = \frac{1}{2}ba = \frac{1}{2}ab$.


Proof of Theorem 5.5: We draw 2 squares, each of side length $a + b$, by rearranging 4 copies of the
given triangle in 2 different ways:


[Figure: two squares of side length $a + b$, each containing 4 copies of the given triangle]


We can get the area of each of these squares by adding the areas of all the figures that comprise each
square.


The square on the left consists of 4 copies of the given right triangle, a square of side length $a$ and a
square of side length $b$. It follows that the area of this square is $4 \cdot \frac{1}{2}ab + a^2 + b^2 = 2ab + a^2 + b^2$.


The square on the right consists of 4 copies of the given right triangle, and a square of side length $c$. It
follows that the area of this square is $4 \cdot \frac{1}{2}ab + c^2 = 2ab + c^2$.


Since the areas of both squares of side length $a + b$ are equal (both areas are equal to $(a + b)^2$),
$2ab + a^2 + b^2 = 2ab + c^2$. Cancelling $2ab$ from each side of this equation yields $a^2 + b^2 = c^2$. □


Question: In a right triangle where both legs have length 1, what is the length of the hypotenuse? 


Let’s try to answer this question. If we let $c$ be the length of the hypotenuse of the triangle, then by the
Pythagorean Theorem, we have $c^2 = 1^2 + 1^2 = 1 + 1 = 2$. Since $c^2 = c \cdot c$, we need to find a number
with the property that when you multiply that number by itself you get 2. The Pythagoreans showed
that if we use only numbers in $\mathbb{Q}$, then no such number exists.


Theorem 5.6: There does not exist a rational number $a$ such that $a^2 = 2$.


Analysis: We will prove this Theorem by assuming that there is a rational number $a$ such that $a^2 = 2$,
and arguing until we reach a contradiction. A first attempt at a proof would be to let $a = \frac{m}{n} \in \mathbb{Q}$ satisfy
$\left(\frac{m}{n}\right)^2 = 2$. It follows that $m^2 = 2n^2$ (because $\frac{m^2}{n^2} = \frac{m \cdot m}{n \cdot n} = \frac{m}{n} \cdot \frac{m}{n} = \left(\frac{m}{n}\right)^2 = 2 = \frac{2}{1}$, and so, $m^2 \cdot 1 = n^2 \cdot 2$),
showing that $m^2$ is even. We will then use this information to show that both $m$ and $n$ are even (at
this point, you may want to try to use the two statements just derived, $m^2 = 2n^2$ and $m^2$ is even, to prove this yourself).


Now, in our first attempt, the fact that m and n both turned out to be even did not produce a 
contradiction. However, we can modify the beginning of the argument to make this happen. 


Remember that every rational number has infinitely many representations. For example, $\frac{6}{12}$ is the same
rational number as $\frac{2}{4}$ (because $6 \cdot 4 = 12 \cdot 2$). Notice that in both representations, the numerator
(number on the top) and the denominator (number on the bottom) are even. However, they are both
equivalent to $\frac{1}{2}$, which has the property that the numerator is not even.


In Problem 9 below, you will be asked to show that every rational number can be written in the form
$\frac{m}{n}$ where at least one of $m$ or $n$ is not even. We can now adjust our argument to get the desired
contradiction.


Proof of Theorem 5.6: Assume, toward contradiction, that there is a rational number $a$ such that
$a^2 = 2$. Since $a$ is a rational number, there are $m \in \mathbb{Z}$ and $n \in \mathbb{Z}^*$, not both even, so that $a = \frac{m}{n}$.


So, we have $\frac{m^2}{n^2} = \frac{m \cdot m}{n \cdot n} = \frac{m}{n} \cdot \frac{m}{n} = a \cdot a = a^2 = 2 = \frac{2}{1}$. Thus, $m^2 \cdot 1 = n^2 \cdot 2$. So, $m^2 = 2n^2$. Therefore,
$m^2$ is even. If $m$ were odd, then by Theorem 4.4 (from Lesson 4), $m^2 = m \cdot m$ would be odd. So, $m$ is
even.


Since $m$ is even, there is $k \in \mathbb{Z}$ such that $m = 2k$. Replacing $m$ by $2k$ in the equation $m^2 = 2n^2$ gives
us $2n^2 = m^2 = (2k)^2 = (2k)(2k) = 2(k(2k))$. So, $n^2 = k(2k) = (k \cdot 2)k = (2k)k = 2(k \cdot k)$. So,
we see that $n^2$ is even, and again by Theorem 4.4, $n$ is even.


So, we have $m$ even and $n$ even, contrary to our original assumption that $m$ and $n$ are not both even.
Therefore, there is no rational number $a$ such that $a^2 = 2$. □
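

Two ingredients of the argument can be illustrated computationally: reducing a fraction so that numerator and denominator are not both even, and searching for integers with $m^2 = 2n^2$. The sketch below is a finite experiment, not a substitute for the proof. The function names are hypothetical, and reducing by the greatest common divisor is one way (not necessarily the route taken in Problem 9) to obtain a representation that is not even in both positions.

```python
from math import gcd


def lowest_terms(m: int, n: int):
    """Rewrite m/n (n nonzero) with a positive denominator and no common factor."""
    g = gcd(m, n)
    if n < 0:
        g = -g
    return m // g, n // g


def rational_square_roots_of_two(bound: int):
    """List all (m, n) with 1 <= n <= bound and m*m == 2*n*n.

    Only 0 <= m <= 2n is searched, since m*m == 2*n*n forces m < 2n for n >= 1.
    """
    return [(m, n)
            for n in range(1, bound + 1)
            for m in range(0, 2 * n + 1)
            if m * m == 2 * n * n]


print(lowest_terms(6, 12), lowest_terms(2, -4))
print(rational_square_roots_of_two(200))
```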


So, the big question is, “Is there an ordered field $F$ with $F$ containing $\mathbb{Q}$ and $a \in F$ such that $a^2 = 2$?”
Spoiler Alert! There is! We call it R, the ordered field of real numbers. 


Completeness 


Let $(F, \leq)$ be an ordered field and let $S$ be a nonempty subset of $F$. We say that $S$ is bounded above if
there is $M \in F$ such that for all $s \in S$, $s \leq M$. Each such number $M$ is called an upper bound of $S$.


In words, an upper bound of a set $S$ is simply an element from the field that is at least as big as every
element in $S$.


Similarly, we say that $S$ is bounded below if there is $K \in F$ such that for all $s \in S$, $K \leq s$. Each such
number $K$ is called a lower bound of $S$.


In words, a lower bound of a set S is simply an element from the field that is no bigger than any element 
in S. 


We will say that S is bounded if it is both bounded above and bounded below. Otherwise S is 
unbounded. 


A least upper bound of a set S is an upper bound that is smaller than any other upper bound of S, and 
a greatest lower bound of S is a lower bound that is larger than any other lower bound of S. 


Example 5.2: Let $(F, \leq)$ be an ordered field with $\mathbb{Q} \subseteq F$.


Note: The only two examples of F that we are interested in right now are Q (the set of rational 
numbers) and R (the set of real numbers). Although we haven’t finished defining the real numbers, 
you probably have some intuition as to what they look like—after all, this is the number system you 
have used throughout high school. As you look at the set in each example below, think about what it 
looks like as a subset of Q and as a subset of R. 


1. S = {1,2,3,4,5} is bounded. 


5 is an upper bound of S, as is any number larger than 5. The number 5 is special in the sense
that there are no upper bounds smaller than it. So, 5 is the least upper bound of S.


Similarly, 1 is a lower bound of S, as is any number smaller than 1. The number 1 is the greatest
lower bound of S because there are no lower bounds larger than it.


Notice that the least upper bound and greatest lower bound of S are inside the set S itself. This 
will always happen when the set S is finite. 


2. $T = \{x \in F \mid -2 < x \leq 2\}$ is also bounded. Any number greater than or equal to 2 is an upper
bound of $T$, and any number less than or equal to $-2$ is a lower bound of $T$.


2 is the least upper bound of $T$ and $-2$ is the greatest lower bound of $T$.
Note that the least upper bound of $T$ is in $T$, whereas the greatest lower bound of $T$ is not in $T$.


3. $U = \{x \in F \mid x < -3\}$ is bounded above by any number greater than or equal to $-3$, and $-3$ is
the least upper bound of $U$. The set $U$ is not bounded below, and therefore, $U$ is unbounded.


4. $V = \{x \in F \mid x^2 < 2\}$ is bounded above by 2. To see this, note that if $x > 2$, then $x^2 > 4 > 2$,
and therefore, $x \notin V$. Any number greater than 2 is also an upper bound.


Is 2 the least upper bound of $V$? It’s not! For example, $\frac{3}{2}$ is also an upper bound. Indeed, if
$x > \frac{3}{2}$, then $x^2 > \frac{9}{4} > 2$ (the reader should verify that for all $a, b \in \mathbb{R}^+$, $a > b \to a^2 > b^2$).


Does $V$ have a least upper bound? A moment’s thought might lead you to suspect that a least
upper bound $M$ would satisfy $M^2 = 2$. And it turns out that you are right! (Proving this,
however, is quite difficult). Clearly, this least upper bound $M$ is not in the set $V$. The big question
is “Does M exist at all?” 


Well, if F = Q, then by Theorem 5.6, M does not exist in F. In this case, V is an example of a 
set which is bounded above in Q, but has no least upper bound in Q. 


So, if we want an ordered field F containing Q where M does exist, we can insist that F has the 
property that any set which is bounded above in F has a least upper bound in F. It turns out 
that there is exactly one such ordered field (up to renaming the elements) and we call it the 
ordered field of real numbers, R. 
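

To see concretely why no rational number can serve as a least upper bound of $V$, one can start from any positive rational upper bound and produce a strictly smaller one. The sketch below uses the averaging step $u \mapsto \frac{1}{2}\left(u + \frac{2}{u}\right)$, which is a standard device brought in here for illustration and is not part of the text above. If $u > 0$ and $u^2 > 2$, then the new value is smaller than $u$ and its square still exceeds 2, so it is again an upper bound of $V$. The function name `smaller_upper_bound` is hypothetical.

```python
from fractions import Fraction


def smaller_upper_bound(u: Fraction) -> Fraction:
    """Given a rational u > 0 with u*u > 2, return a smaller rational whose square exceeds 2."""
    return (u + 2 / u) / 2


# Start from the upper bound 2 and improve it a few times.
u = Fraction(2)
bounds = [u]
for _ in range(4):
    u = smaller_upper_bound(u)
    bounds.append(u)

for b in bounds:
    print(b, b * b > 2)
```

Each value printed is a rational upper bound of $V$ that is strictly smaller than the one before it, which is consistent with $V$ having no least upper bound in $\mathbb{Q}$.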


Many authors use the term supremum for “least upper bound” and infimum for “greatest lower 
bound,” and they may write sup A and inf A for the supremum and infimum of a set A, respectively (if 
they exist). 


In the examples above, we stated the least upper bound and greatest lower bound of the sets S, T, U, 
and V without proof. Intuitively, it seems reasonable that those numbers are correct. Let’s do one of 
the examples carefully. 


Theorem 5.7: Let $U = \{x \in F \mid x < -3\}$. Then $\sup U = -3$.


Analysis: We need to show that $-3$ is an upper bound of $U$, and that any number less than $-3$ is not
an upper bound of $U$. That $-3$ is an upper bound of $U$ follows immediately from the definition of $U$.


The harder part of the argument is showing that a number less than $-3$ is not an upper bound of $U$.
However, conceptually it’s not hard to see that this is true. If $a < -3$, we simply need to find some
number $x$ between $a$ and $-3$. Picture $a$ to the left of $-3$ on the number line, with $x$ between them.


Notice that $a$ can be very close to $-3$ and we don’t know exactly what $a$ is; we know only that it’s less
than $-3$. So, we need to be careful how we choose $x$. The most natural choice for $x$ would be to go
midway between $a$ and $-3$. In other words, we can take the average of $a$ and $-3$. So, we will let
$x = \frac{1}{2}(a + (-3))$. Then we just need to verify that $a < x$ and that $x \in U$ (that is, $x < -3$).


Proof of Theorem 5.7: If $x \in U$, then $x < -3$ by definition, and so, $-3$ is an upper bound of $U$.


Suppose that $a < -3$ (or equivalently, $-a - 3 > 0$). We want to show that $a$ is not an upper bound of
$U$. To do this, we let $x = \frac{1}{2}(a - 3) = 2^{-1}(a + (-3))$. Then $x \in F$ because $F$ is closed under addition and
multiplication, and the multiplicative inverse property holds in $F^*$. We will show that $a < x < -3$.


$x - a = \frac{1}{2}(a - 3) - a = \frac{1}{2}(a - 3) - \frac{1}{2}(2a) = \frac{1}{2}(a - 3 - 2a) = \frac{1}{2}(a - 2a - 3) = \frac{1}{2}(-a - 3)$.


Since $\frac{1}{2} > 0$ (by Theorem 5.4) and $-a - 3 > 0$, it follows that $x - a > 0$, and therefore, $x > a$.


$-3 - x = -3 - \frac{1}{2}(a - 3) = \frac{1}{2}(-6) - \frac{1}{2}(a - 3) = \frac{1}{2}(-6 - a + 3) = \frac{1}{2}(-a - 3)$.


Again, since $\frac{1}{2} > 0$ and $-a - 3 > 0$, it follows that $-3 - x > 0$, and therefore, $x < -3$. Thus, $x \in U$.


So, we found an element $x \in U$ (because $x < -3$) with $a < x$. This shows that $a$ is not an upper bound
of $U$. It follows that $-3 = \sup U$. □
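

The choice of $x$ in the proof is easy to test on sample values of $a$. The sketch below works in the rational numbers (an assumption; the theorem is stated for any ordered field $F$ containing $\mathbb{Q}$), and the function name `witness` is hypothetical.

```python
from fractions import Fraction


def witness(a: Fraction) -> Fraction:
    """For a < -3, return x = (1/2)(a + (-3)), the midpoint of a and -3."""
    if not a < -3:
        raise ValueError("a must be less than -3")
    return Fraction(1, 2) * (a + (-3))


for a in [Fraction(-4), Fraction(-301, 100), Fraction(-1000)]:
    x = witness(a)
    # x is larger than a, yet still belongs to U because x < -3.
    print(a, x, a < x < -3)
```

However close $a$ is taken to $-3$, the midpoint still lands strictly between the two, which is what rules out $a$ as an upper bound of $U$.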


An ordered field $(F, \leq)$ has the Completeness Property if every nonempty subset of $F$ that is bounded
above in $F$ has a least upper bound in $F$. In this case, we say that $(F, \leq)$ is a complete ordered field.


Theorem 5.8: There is exactly one complete ordered field (up to renaming the elements). 


The proof of Theorem 5.8 is quite long and requires some machinery that we haven’t yet developed. 
We will therefore accept it as true for the purpose of this book, and we let R be the unique complete 
ordered field guaranteed to exist by the theorem. 


We will finish this section by proving two useful theorems about the complete ordered field $\mathbb{R}$.


Theorem 5.9 (The Archimedean Property of $\mathbb{R}$): For every $x \in \mathbb{R}$, there is $n \in \mathbb{N}$ such that $n > x$.


In other words, the Archimedean Property says that the set of natural numbers is unbounded in the
reals. In particular, the set of natural numbers is not bounded from above in the set of real numbers.


We will prove this theorem by contradiction using the Completeness Property of the reals. If we
(wrongly) assume that the set of natural numbers is bounded from above, then the Completeness
Property of the reals gives us a least upper bound $x$. Since $x$ is a least upper bound, $x - 1$ is not an
upper bound. Do you see the problem yet? If $x - 1 < n \in \mathbb{N}$, then $x < n + 1$. But then $x$ is not an
upper bound for the set of natural numbers, contrary to our assumption. Let’s write out the details.


Proof: Suppose toward contradiction that $\mathbb{N}$ is bounded from above. By the Completeness Property of
$\mathbb{R}$, $x = \sup \mathbb{N}$ exists. Since $x - 1$ is not an upper bound for $\mathbb{N}$, there is $n \in \mathbb{N}$ such that $x - 1 < n$. Then
we have $x = x + (-1 + 1) = (x - 1) + 1 < n + 1$. Since $\mathbb{N}$ is closed under addition, $n + 1 \in \mathbb{N}$. So, $x$
is not an upper bound for $\mathbb{N}$, contradicting the fact that $x = \sup \mathbb{N}$. It follows that $\mathbb{N}$ is not bounded
from above. So, for every $x \in \mathbb{R}$, there is $n \in \mathbb{N}$ such that $n > x$. □
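

For rational inputs, a natural number exceeding $x$ can be written down explicitly. The sketch below is only an illustration on rationals, using the floor function from Python's standard library; it does not replace the proof above, which needs the Completeness Property to handle arbitrary real numbers. The function name `natural_above` is hypothetical.

```python
from fractions import Fraction
from math import floor


def natural_above(x: Fraction) -> int:
    """Return a natural number n with n > x (0 is used when x is negative)."""
    return max(0, floor(x) + 1)


for x in [Fraction(22, 7), Fraction(-5, 2), Fraction(5)]:
    n = natural_above(x)
    print(x, n, n > x)
```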


Theorem 5.10 (The Density Theorem): If $x, y \in \mathbb{R}$ with $x < y$, then there is $q \in \mathbb{Q}$ with $x < q < y$.


In other words, the Density Theorem says that between any two real numbers we can always find a
rational number. We say that $\mathbb{Q}$ is dense in $\mathbb{R}$.


To help understand the proof, let’s first run a simple simulation using a specific example. Let’s let
$x = \frac{16}{3}$ and $y = \frac{17}{3}$. We begin by subtracting to get $y - x = \frac{1}{3}$. This is the distance between $x$ and $y$. We
wish to find a natural number $n$ such that $\frac{1}{n}$ is smaller than this distance. In other words, we want
$\frac{1}{n} < \frac{1}{3}$, or equivalently, $n > 3$. So, we can let $n$ be any natural number greater than 3, say $n = 4$. We
now want to “shift” $\frac{1}{n} = \frac{1}{4}$ to the right to get a rational number between $x$ and $y$. We can do this as
follows. We multiply $n$ times $x$ to get $nx = 4 \cdot \frac{16}{3} = \frac{64}{3}$. We then let $m$ be the least integer greater than
$nx$. So, $m = \frac{66}{3} = 22$. Finally, we let $q = \frac{m}{n} = \frac{22}{4} = \frac{11}{2}$. And we did it! Indeed, we have $\frac{16}{3} < \frac{11}{2} < \frac{17}{3}$. The
reader should confirm that these inequalities hold. Let’s write out the details of the proof.


Proof: Let’s first consider the case where $0 \leq x < y$. Let $z = y - x = y + (-x)$. Since $\mathbb{R}$ has the
additive inverse property and is closed under addition, $z \in \mathbb{R}$. Also, $z > 0$. By the Archimedean
Property, there is $n \in \mathbb{N}$ such that $n > \frac{1}{z}$. Using Problem 5 (part (v)) in the problem set below, we have
$\frac{1}{n} < z$. By the Archimedean Property once again, there is $m \in \mathbb{N}$ such that $m > nx$. Therefore, $\frac{m}{n} > x$
(Check this!). So, $\{m \in \mathbb{N} \mid \frac{m}{n} > x\} \neq \emptyset$. By the Well Ordering Principle, $\{m \in \mathbb{N} \mid \frac{m}{n} > x\}$ has a least
element, let’s call it $k$. Since $k > 0$ (because $x \geq 0$ and $n > 0$) and $k$ is the least natural number such
that $\frac{k}{n} > x$, it follows that $k - 1 \in \mathbb{N}$ and $\frac{k - 1}{n} \leq x$, or equivalently, $\frac{k}{n} - \frac{1}{n} \leq x$. Therefore, we have
$\frac{k}{n} \leq x + \frac{1}{n} < x + z = x + (y - x) = y$. Thus, $x < \frac{k}{n} < y$. Since $k, n \in \mathbb{N}$, we have $\frac{k}{n} \in \mathbb{Q}$.


Now, we consider the case where $x < 0$ and $x < y$. By the Archimedean Property, there is $t \in \mathbb{N}$ such
that $t > -x$. Then, we have $0 < x + t < y + t$. So, $x + t$ and $y + t$ satisfy the first case above. Thus,
there is $q \in \mathbb{Q}$ with $x + t < q < y + t$. It follows that $x < q - t < y$. Since $t \in \mathbb{N}$, $-t \in \mathbb{Z}$. Since
$\mathbb{Z} \subseteq \mathbb{Q}$, $-t \in \mathbb{Q}$. So, we have $q, -t \in \mathbb{Q}$. Since $\mathbb{Q}$ is closed under addition, $q - t = q + (-t) \in \mathbb{Q}$. □
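

The construction in the proof translates into a short procedure. The sketch below runs it on rational inputs standing in for real numbers (an assumption, since real numbers cannot be represented exactly here). It takes $n$ to be the first natural number past $\frac{1}{y - x}$ and $k$ to be the least integer with $\frac{k}{n} > x$. Allowing $k$ to be any integer handles negative $x$ directly, which is a small departure from the two-case structure of the proof. The function name `rational_between` is hypothetical.

```python
from fractions import Fraction
from math import floor


def rational_between(x: Fraction, y: Fraction) -> Fraction:
    """Return a rational q with x < q < y, following the construction in the proof."""
    if not x < y:
        raise ValueError("need x < y")
    z = y - x                  # distance between x and y
    n = floor(1 / z) + 1       # natural number with n > 1/z, so 1/n < z
    k = floor(n * x) + 1       # least integer with k/n > x
    return Fraction(k, n)


q = rational_between(Fraction(16, 3), Fraction(17, 3))
print(q)

r = rational_between(Fraction(-1, 2), Fraction(-1, 3))
print(r)
```

With the inputs $\frac{16}{3}$ and $\frac{17}{3}$, the procedure chooses $n = 4$ and $k = 22$, the same values as in the simulation preceding the proof.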
Problem Set 5


Full solutions to these problems are available for free download here:


www.SATPrepGet800.com/PMFBXSG


LEVEL 1


1. The addition and multiplication tables below are defined on the set $S = \{0, 1, 2\}$. Show that
$(S, +, \cdot)$ does not define a field.


 +  | 0  1  2          ·  | 0  1  2
----+---------        ----+---------
 0  | 0  1  2          0  | 0  0  0
 1  | 1  2  0          1  | 0  1  2
 2  | 2  0  1          2  | 0  2  2


2. Let $F = \{0, 1\}$, where $0 \neq 1$. Show that there is exactly one field $(F, +, \cdot)$, where 0 is the
additive identity and 1 is the multiplicative identity.


LEVEL 2 


3. Let $(F, +, \cdot)$ be a field. Prove each of the following:
(i) If $a, b \in F$ with $a + b = b$, then $a = 0$.
(ii) If $a \in F$, $b \in F^*$, and $ab = b$, then $a = 1$.
(iii) If $a \in F$, then $a \cdot 0 = 0$.
(iv) If $a \in F^*$, $b \in F$, and $ab = 1$, then $b = \frac{1}{a}$.
(v) If $a, b \in F$ and $ab = 0$, then $a = 0$ or $b = 0$.
(vi) If $a \in F$, then $-a = -1a$.
(vii) $(-1)(-1) = 1$.


4. Let $(F, +, \cdot)$ be a field with $\mathbb{N} \subseteq F$. Prove that $\mathbb{Q} \subseteq F$.


LEVEL 3 


5. Let $(F, \leq)$ be an ordered field. Prove each of the following:
(i) If $a, b \in F$, exactly one of the following holds: $a < b$, $a = b$, or $a > b$.
(ii) If $a, b \in F$, $a \leq b$, and $b \leq a$, then $a = b$.
(iii) If $a, b, c \in F$, $a < b$, and $b < c$, then $a < c$.
(iv) If $a, b, c \in F$, $a \leq b$, and $b \leq c$, then $a \leq c$.
(v) If $a, b \in F^+$ and $a > b$, then $\frac{1}{a} < \frac{1}{b}$.
(vi) If $a, b \in F$, then $a > b$ if and only if $-a < -b$.
(vii) If $a, b \in F$, then $a \geq b$ if and only if $-a \leq -b$.




6. Let $(F, +, \cdot)$ be a field. Show that $(F, \cdot)$ is a commutative monoid.


LEVEL 4 


7. Prove that there is no smallest positive real number. 


8. Let a be a nonnegative real number. Prove that a = 0 if and only if a is less than every positive 
real number. (Note: a nonnegative means that a is positive or zero.) 


9. Prove that every rational number can be written in the form $\frac{m}{n}$ where $m \in \mathbb{Z}$, $n \in \mathbb{Z}^*$, and at least
one of $m$ or $n$ is not even.


LEVEL 5 


10. Show that every nonempty set of real numbers that is bounded below has a greatest lower bound 
in R. 


11. Show that between any two real numbers there is a real number that is not rational. 


12. Let $T = \{x \in F \mid -2 < x \leq 2\}$. Prove $\sup T = 2$ and $\inf T = -2$.


CHALLENGE PROBLEM


13. Let $V = \{x \in F \mid x^2 < 2\}$ and let $a = \sup V$. Prove that $a^2 = 2$.


LESSON 6 - TOPOLOGY
THE TOPOLOGY OF R


Intervals of Real Numbers 


A set $I$ of real numbers is called an interval if any real number that lies between two numbers in $I$ is
also in $I$. Symbolically, we can write


$\forall x, y \in I \; \forall z \in \mathbb{R} \, (x < z < y \to z \in I)$.


The expression above can be read “For all $x$, $y$ in $I$ and all $z \in \mathbb{R}$, if $x$ is less than $z$ and $z$ is less than $y$,
then $z$ is in $I$.”
Example 6.1: 


1. The set $A = \{0, 1\}$ is not an interval. $A$ consists of just the two real numbers 0 and 1. There are
infinitely many real numbers between 0 and 1. For example, the real number $\frac{1}{2}$ satisfies
$0 < \frac{1}{2} < 1$, but $\frac{1}{2} \notin A$.


2. $\mathbb{R}$ is an interval. This follows trivially from the definition. If we replace $I$ by $\mathbb{R}$, we get
$\forall x, y \in \mathbb{R} \; \forall z \in \mathbb{R} \, (x < z < y \to z \in \mathbb{R})$. In other words, if we start with two real numbers, and
take a real number between them, then that number is a real number (which we already said).


When we are thinking of $\mathbb{R}$ as an interval, we sometimes use the notation $(-\infty, \infty)$ and refer to this as
the real line. The following picture gives the standard geometric interpretation of the real line.


[Figure: the real line, with the points −3, −2, −1, 0, 1, 2, 3 marked]


In addition to the real line, there are 8 other types of intervals. 


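Open Interval: $(a, b) = \{x \in \mathbb{R} \mid a < x < b\}$

Closed Interval: $[a, b] = \{x \in \mathbb{R} \mid a \leq x \leq b\}$

Half-open Intervals: $(a, b] = \{x \in \mathbb{R} \mid a < x \leq b\}$ and $[a, b) = \{x \in \mathbb{R} \mid a \leq x < b\}$

Infinite Open Intervals: $(a, \infty) = \{x \in \mathbb{R} \mid x > a\}$ and $(-\infty, b) = \{x \in \mathbb{R} \mid x < b\}$

Infinite Closed Intervals: $[a, \infty) = \{x \in \mathbb{R} \mid x \geq a\}$ and $(-\infty, b] = \{x \in \mathbb{R} \mid x \leq b\}$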


It’s easy to check that each of these eight types of sets satisfies the definition of being an interval. 
Conversely, every interval has one of these nine forms. This will follow immediately from Theorem 6.1 
and Problem 4 below. 


Note that the first four intervals above (the open, closed, and two half-open intervals) are bounded. 
They are each bounded below by a and bounded above by b. In fact, for each of these intervals, a is 
the greatest lower bound and b is the least upper bound. Using the notation from Lesson 5, we have 
for example, a = inf(a, b) and b = sup(a, b). 




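Example 6.2:
1. The half-open interval $(-2, 1] = \{x \in \mathbb{R} \mid -2 < x \leq 1\}$ has the following graph:


[Figure: number line from −3 to 3 showing the interval $(-2, 1]$]


2. The infinite open interval $(0, \infty) = \{x \in \mathbb{R} \mid x > 0\}$ has the following graph:


[Figure: number line from −3 to 3 showing the interval $(0, \infty)$]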


Theorem 6.1: If an interval $I$ is bounded, then there are $a, b \in \mathbb{R}$ such that one of the following holds:
$I = (a, b)$, $I = [a, b]$, $I = (a, b]$, or $I = [a, b)$.


Analysis: We will prove this by letting $a = \inf I$ and $b = \sup I$ (in other words, $a$ is the greatest lower
bound of $I$ and $b$ is the least upper bound of $I$), and then doing each of the following:
(1) We will show $I \subseteq [a, b]$.
(2) We will show $(a, b) \subseteq I$.
(3) We will then look at 4 different cases. As one sample case, if $a, b \in I$, then we will have
$I \subseteq [a, b]$ and $[a, b] \subseteq I$. It then follows from the “Axiom of Extensionality” that $I = [a, b]$.


Recall: Given sets $X$ and $Y$, the Axiom of Extensionality says that $X$ and $Y$ are the same set if and only
if $X$ and $Y$ have precisely the same elements (See the technical note following Theorem 2.5 in Lesson
2). In symbols,


$X = Y$ if and only if $\forall x (x \in X \leftrightarrow x \in Y)$.


Since $\forall x (x \in X \leftrightarrow x \in Y)$ is logically equivalent to $\forall x (x \in X \to x \in Y) \wedge \forall x (x \in Y \to x \in X)$, we
have


$X = Y$ if and only if $\forall x (x \in X \to x \in Y)$ and $\forall x (x \in Y \to x \in X)$.


Therefore, to show that $X = Y$, we can instead show that $X \subseteq Y$ and $Y \subseteq X$. This is the approach we
will take in the proof below.


Proof of Theorem 6.1: Let $I$ be a bounded interval. Since $I$ is bounded, by the Completeness of $\mathbb{R}$, $I$ has
a least upper bound $b$. By Problem 10 in Lesson 5, $I$ has a greatest lower bound $a$. If $x \in I$, then by the
definitions of upper bound and lower bound, we have $x \in [a, b]$. Since $x$ was an arbitrary element of
$I$, $\forall x (x \in I \to x \in [a, b])$. So, $I \subseteq [a, b]$.


Now, let $z \in (a, b)$. It follows that $a < z < b$. Since $b$ is the least upper bound of $I$, $z$ is not an upper
bound of $I$. So, there is $y \in I$ with $z < y$. Since $a$ is the greatest lower bound of $I$, $z$ is not a lower
bound of $I$. So, there is $x \in I$ with $x < z$. Since $I$ is an interval, $x, y \in I$, and $x < z < y$, it follows that
$z \in I$. Since $z$ was an arbitrary element of $(a, b)$, we have shown $\forall x (x \in (a, b) \to x \in I)$. So,
$(a, b) \subseteq I$.


We have shown that $(a, b) \subseteq I$ and $I \subseteq [a, b]$. There are now 4 cases to consider.


Case 1: If both the greatest lower bound of $I$ (namely, $a$) and the least upper bound of $I$ (namely, $b$)
are elements of $I$, then we have $[a, b] \subseteq I$ and $I \subseteq [a, b]$. So, $I = [a, b]$.


Case 2: If $a \in I$ and $b \notin I$, then we have $[a, b) \subseteq I$ and $I \subseteq [a, b)$. So, $I = [a, b)$.
Case 3: If $a \notin I$ and $b \in I$, then we have $(a, b] \subseteq I$ and $I \subseteq (a, b]$. So, $I = (a, b]$.
Case 4: If $a \notin I$ and $b \notin I$, then we have $(a, b) \subseteq I$ and $I \subseteq (a, b)$. So, $I = (a, b)$. □


Note: You will be asked to prove the analogous result for unbounded intervals in Problem 4 below. 


Operations on Sets 
In Lesson 2 we saw how to take the union and intersection of two sets. We now review the definitions 
from that lesson and introduce a few more. 
The union of the sets A and B, written A ∪ B, is the set of elements that are in A or B (or both).


A ∪ B = {x | x ∈ A or x ∈ B}


The intersection of A and B, written A ∩ B, is the set of elements that are simultaneously in A and B.


A ∩ B = {x | x ∈ A and x ∈ B}


The following Venn diagrams for the union and intersection of two sets can be useful for visualizing
these operations. As usual, U is some “universal” set that contains both A and B.


[Figure: Venn diagrams of A ∪ B and A ∩ B]


The difference A \ B is the set of elements that are in A and not in B.


A \ B = {x | x ∈ A and x ∉ B}


The symmetric difference between A and B, written A Δ B, is the set of elements that are in A or B,
but not both.


A Δ B = (A \ B) ∪ (B \ A)


Let’s also look at Venn diagrams for the difference and symmetric difference of two sets.


[Figure: Venn diagrams of A \ B and A Δ B]


Example 6.3: Let A = {0, 1, 2, 3, 4} and B = {3, 4, 5, 6}. We have


1. A ∪ B = {0, 1, 2, 3, 4, 5, 6}

2. A ∩ B = {3, 4}

3. A \ B = {0, 1, 2}

4. B \ A = {5, 6}

5. A Δ B = {0, 1, 2} ∪ {5, 6} = {0, 1, 2, 5, 6}
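

The computations in Example 6.3 can be reproduced with a short Python sketch. It uses Python's built-in set type, whose operators |, &, -, and ^ correspond to union, intersection, difference, and symmetric difference. The variable names are chosen here for illustration only.

```python
# Sets from Example 6.3
A = {0, 1, 2, 3, 4}
B = {3, 4, 5, 6}

union = A | B           # A ∪ B
intersection = A & B    # A ∩ B
a_minus_b = A - B       # A \ B
b_minus_a = B - A       # B \ A
sym_diff = A ^ B        # A Δ B

# The symmetric difference agrees with its definition (A \ B) ∪ (B \ A).
assert sym_diff == (A - B) | (B - A)

print(union, intersection, a_minus_b, b_minus_a, sym_diff)
```

This only works for finite sets. Intervals such as those in the next example need a different representation.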


Example 6.4: Let A = (-2, 1] and B = (0, ∞). We have
1. A ∪ B = (-2, ∞)
2. A ∩ B = (0, 1]
3. A \ B = (-2, 0]
4. B \ A = (1, ∞)
5. A Δ B = (-2, 0] ∪ (1, ∞)


Note: If you have trouble seeing how to compute these, it may be helpful to draw the graphs of A and 
B lined up vertically, and then draw vertical lines through the endpoints of each interval. 


The results follow easily by combining these graphs into a single graph using the vertical lines as guides.
For example, let’s look at A ∩ B in detail. We’re looking for all numbers that are in both A and B. The
two rightmost vertical lines drawn passing through the two graphs above isolate all those numbers
nicely. We see that all numbers between 0 and 1 are in the intersection. We should then think about
the two endpoints 0 and 1 separately. 0 ∉ B and therefore, 0 cannot be in the intersection of A and
B. On the other hand, 1 ∈ A and 1 ∈ B. Therefore, 1 ∈ A ∩ B. So, we see that A ∩ B = (0, 1].


Unions and intersections have many nice algebraic properties such as commutativity (A ∪ B = B ∪ A
and A ∩ B = B ∩ A), associativity ((A ∪ B) ∪ C = A ∪ (B ∪ C) and (A ∩ B) ∩ C = A ∩ (B ∩ C)), and
distributivity (A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C) and A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)).
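

Before proving identities like these, it can be reassuring to test them on small cases. The following Python sketch checks all six identities for every triple of subsets of a three-element universe. The universe {0, 1, 2} and the helper names are assumptions made for illustration. A check like this is evidence, not a proof.

```python
from itertools import combinations


def subsets(universe):
    """Return every subset of `universe` as a frozenset."""
    items = list(universe)
    return [frozenset(combo)
            for size in range(len(items) + 1)
            for combo in combinations(items, size)]


def identities_hold(A, B, C):
    """Check commutativity, associativity, and distributivity for one triple."""
    return (
        A | B == B | A
        and A & B == B & A
        and (A | B) | C == A | (B | C)
        and (A & B) & C == A & (B & C)
        and A & (B | C) == (A & B) | (A & C)
        and A | (B & C) == (A | B) & (A | C)
    )


# Hypothetical small universe, chosen only for illustration.
all_subsets = subsets({0, 1, 2})
all_hold = all(identities_hold(A, B, C)
               for A in all_subsets
               for B in all_subsets
               for C in all_subsets)
print(all_hold)
```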


As an example, let’s prove that the operation of forming unions is associative. You will be asked to 
prove similar results in the problems below. 


Theorem 6.2: The operation of forming unions is associative. 


Note: Before beginning the proof, let’s draw Venn diagrams of the situation to convince ourselves that 
the theorem is true. 


[Figure: Venn diagrams of A ∪ B, B ∪ C, and (A ∪ B) ∪ C = A ∪ (B ∪ C) inside the universal set U]


Proof of Theorem 6.2: Let A, B, and C be sets, and let x ∈ (A ∪ B) ∪ C. Then x ∈ A ∪ B or x ∈ C. If
x ∈ C, then x ∈ B or x ∈ C. So, x ∈ B ∪ C. Then x ∈ A or x ∈ B ∪ C. So, x ∈ A ∪ (B ∪ C). If, on the
other hand, x ∈ A ∪ B, then x ∈ A or x ∈ B. If x ∈ A, then x ∈ A or x ∈ B ∪ C. So, x ∈ A ∪ (B ∪ C).
If x ∈ B, then x ∈ B or x ∈ C. So, x ∈ B ∪ C. Then x ∈ A or x ∈ B ∪ C. So, x ∈ A ∪ (B ∪ C). Since x
was arbitrary, we have shown ∀x(x ∈ (A ∪ B) ∪ C → x ∈ A ∪ (B ∪ C)). Therefore, we have shown
that (A ∪ B) ∪ C ⊆ A ∪ (B ∪ C).


A similar argument can be used to show A ∪ (B ∪ C) ⊆ (A ∪ B) ∪ C (the reader should write out the
details).


Since (A ∪ B) ∪ C ⊆ A ∪ (B ∪ C) and A ∪ (B ∪ C) ⊆ (A ∪ B) ∪ C, (A ∪ B) ∪ C = A ∪ (B ∪ C), and
therefore, the operation of forming unions is associative. □


Remember that associativity allows us to drop parentheses. So, we can now simply write A ∪ B ∪ C
when taking the union of the three sets A, B, and C.


Recall from Lesson 2 that sets A and B are called disjoint or mutually exclusive if A ∩ B = Ø. For
example, the sets (-2, 0] and (1, ∞) are disjoint intervals. Here is a typical Venn diagram of disjoint
sets A and B.


[Figure: Venn diagram of disjoint sets A and B, with A ∩ B = Ø]


In topology, we will often want to look at unions and intersections of more than two sets. Therefore, 
we make the following more general definitions. 
Let X be a nonempty set of sets. 
⋃X = {y | there is Y ∈ X with y ∈ Y} and ⋂X = {y | for all Y ∈ X, y ∈ Y}.
If you’re having trouble understanding what these definitions are saying, you’re not alone. The notation 
probably looks confusing, but the ideas behind these definitions are very simple. You have a whole 
bunch of sets (possibly infinitely many). To take the union of all these sets, you simply throw all the 
elements together into one big set. To take the intersection of all these sets, you take only the elements 
that are in every single one of those sets. 
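
For a finite collection of finite sets, the informal description above can be turned into a short Python sketch. The function names big_union and big_intersection are hypothetical, chosen here for illustration, and the collection X is a made-up example.

```python
def big_union(X):
    """All y that belong to at least one member Y of X."""
    return {y for Y in X for y in Y}


def big_intersection(X):
    """All y that belong to every member Y of X (X must be nonempty)."""
    members = list(X)
    if not members:
        raise ValueError("X must be a nonempty collection of sets")
    # Any element of the intersection lies in the first member,
    # so it is enough to filter that member.
    return {y for y in members[0] if all(y in Y for Y in members)}


# Hypothetical collection of three sets.
X = [frozenset({1, 2, 3}), frozenset({2, 3, 4}), frozenset({3, 4, 5})]
print(big_union(X))
print(big_intersection(X))
```

The nonempty requirement in big_intersection mirrors the assumption that X is a nonempty set of sets.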
Example 6.5:
1. Let A and B be sets and let X = {A, B}. Then
⋃X = {y | there is Y ∈ X with y ∈ Y} = {y | y ∈ A or y ∈ B} = A ∪ B.
⋂X = {y | for all Y ∈ X, y ∈ Y} = {y | y ∈ A and y ∈ B} = A ∩ B.
2. Let A, B, and C be sets, and let X = {A, B, C}. Then
⋃X = {y | there is Y ∈ X with y ∈ Y} = {y | y ∈ A, y ∈ B, or y ∈ C} = A ∪ B ∪ C.
⋂X = {y | for all Y ∈ X, y ∈ Y} = {y | y ∈ A, y ∈ B, and y ∈ C} = A ∩ B ∩ C.
3. Let X = {[0, r) | r ∈ R⁺}. Then
⋃X = {y | there is Y ∈ X with y ∈ Y} = {y | there is r ∈ R⁺ with y ∈ [0, r)} = [0, ∞).
⋂X = {y | for all Y ∈ X, y ∈ Y} = {y | for all r ∈ R⁺, y ∈ [0, r)} = {0}.


Notes: (1) Examples 1 and 2 give a good idea of what ⋃X and ⋂X look like when X is finite. More
generally, if X = {A₁, A₂, …, Aₙ}, then ⋃X = A₁ ∪ A₂ ∪ ⋯ ∪ Aₙ and ⋂X = A₁ ∩ A₂ ∩ ⋯ ∩ Aₙ.


(2) As a specific example of Note 1, let A₁ = (-∞, 5], A₂ = (0, 5), A₃ = [2, 6), and A₄ = (4, 99]. Let
X = {A₁, A₂, A₃, A₄}. Then

⋃X = A₁ ∪ A₂ ∪ A₃ ∪ A₄ = (-∞, 5] ∪ (0, 5) ∪ [2, 6) ∪ (4, 99] = (-∞, 99].

⋂X = A₁ ∩ A₂ ∩ A₃ ∩ A₄ = (-∞, 5] ∩ (0, 5) ∩ [2, 6) ∩ (4, 99] = (4, 5).

If you have trouble seeing how to compute the intersection, it may help to line up the graphs of the
intervals, as was done in the Note following Example 6.4, and/or take the intersections two at a time:
(-∞, 5] ∩ (0, 5) = (0, 5) because (0, 5) ⊆ (-∞, 5].
(0, 5) ∩ [2, 6) = [2, 5) (draw the line graphs if you don’t see this).
[2, 5) ∩ (4, 99] = (4, 5) (again, draw the line graphs if you don’t see this).


(3) Let’s prove carefully that {y | there is r ∈ R⁺ with y ∈ [0, r)} = [0, ∞).
For convenience, let’s let A = {y | there is r ∈ R⁺ with y ∈ [0, r)}.


If y ∈ A, then there is r ∈ R⁺ with y ∈ [0, r). So, 0 ≤ y < r. In particular, y ≥ 0. So, y ∈ [0, ∞). Since
y ∈ A was arbitrary, we have shown that A ⊆ [0, ∞).


Let y ∈ [0, ∞). Since (y + 1) - y = 1 > 0, we have y + 1 > y. So, y ∈ [0, y + 1). Since y + 1 ∈ R⁺,
y ∈ A. Since y ∈ [0, ∞) was arbitrary, we have shown that [0, ∞) ⊆ A.


Since A ⊆ [0, ∞) and [0, ∞) ⊆ A, it follows that A = [0, ∞).


(4) Let’s also prove carefully that {y | for all r ∈ R⁺, y ∈ [0, r)} = {0}.
For convenience, let’s let B = {y | for all r ∈ R⁺, y ∈ [0, r)}.


If y ∈ B, then for all r ∈ R⁺, y ∈ [0, r). So, for all r ∈ R⁺, 0 ≤ y < r. So, y is a nonnegative real
number that is less than every positive real number. By Problem 8 in Problem Set 5, y = 0. Therefore,
y ∈ {0}. Since y ∈ B was arbitrary, we have shown that B ⊆ {0}.


Now, let y ∈ {0}. Then y = 0. For all r ∈ R⁺, 0 ∈ [0, r). So, y ∈ B. It follows that {0} ⊆ B.
Since B ⊆ {0} and {0} ⊆ B, it follows that B = {0}.


(5) Note that the empty union is empty. Indeed, we have ⋃Ø = {y | there is Y ∈ Ø with y ∈ Y} = Ø.
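

The two-at-a-time intersections in Note 2 can be sketched in Python. The Interval class and the intersect function below are hypothetical helpers written for illustration. They assume that the two intervals overlap, so that the result is again a nonempty interval. Infinite endpoints are represented by math.inf.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lo: float          # left endpoint (may be -math.inf)
    hi: float          # right endpoint (may be math.inf)
    lo_closed: bool    # True if the left endpoint belongs to the interval
    hi_closed: bool    # True if the right endpoint belongs to the interval


def intersect(I, J):
    """Intersect two overlapping intervals."""
    # Left endpoint: the larger one wins. On a tie, the endpoint is
    # included only if both intervals include it.
    if I.lo > J.lo:
        lo, lo_closed = I.lo, I.lo_closed
    elif J.lo > I.lo:
        lo, lo_closed = J.lo, J.lo_closed
    else:
        lo, lo_closed = I.lo, I.lo_closed and J.lo_closed
    # Right endpoint: the smaller one wins, with the same tie rule.
    if I.hi < J.hi:
        hi, hi_closed = I.hi, I.hi_closed
    elif J.hi < I.hi:
        hi, hi_closed = J.hi, J.hi_closed
    else:
        hi, hi_closed = I.hi, I.hi_closed and J.hi_closed
    return Interval(lo, hi, lo_closed, hi_closed)


A1 = Interval(-math.inf, 5, False, True)   # (-∞, 5]
A2 = Interval(0, 5, False, False)          # (0, 5)
A3 = Interval(2, 6, True, False)           # [2, 6)
A4 = Interval(4, 99, False, True)          # (4, 99]

step1 = intersect(A1, A2)      # (0, 5)
step2 = intersect(step1, A3)   # [2, 5)
step3 = intersect(step2, A4)   # (4, 5)
print(step3)
```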


If X is a nonempty set of sets, we say that X is disjoint if ⋂X = Ø. We say that X is pairwise disjoint if
for all A, B ∈ X with A ≠ B, A and B are disjoint. For example, if we let X = {(n, n + 1) | n ∈ Z}, then
X is both disjoint and pairwise disjoint.


Are the definitions of disjoint and pairwise disjoint equivalent? You will be asked to answer this
question in Problem 5 below.


Open and Closed Sets


A subset X of R is said to be open if for every real number x ∈ X, there is an open interval (a, b) with
x ∈ (a, b) and (a, b) ⊆ X.


In words, a set is open in R if every number in the set has “some space” on both sides of that number 
inside the set. If you think of each point in the set as an animal, then each animal in the set should be 
able to move a little to the left and a little to the right without ever leaving the set. Another way to 
think of this is that no number is on “the edge” or “the boundary” of the set, about to fall out of it. 
Example 6.6: 


1. Every bounded open interval is open. To see this, let X = (a, b) and let x ∈ X. Then X = (a, b)
itself is an open interval with x ∈ (a, b) and (a, b) ⊆ X. For example, (0, 1) is an open set.


2. We will prove in the theorems below that all open intervals are open sets. For example,
(-2, ∞), (-∞, 5), and (-∞, ∞) are all open sets.


3. (0, 1] is not an open set because the “boundary point” 1 is included in the set. If (a, b) is any
open interval containing 1, then (a, b) ⊈ (0, 1] because there are numbers greater than 1 inside
(a, b). For example, let x = (1/2)(1 + b) (the average of 1 and b). Since b > 1, we have that
x > (1/2)(1 + 1) = (1/2) · 2 = 1. So, x > 1. Also, since 1 > a, x > a. Now, since 1 < b, we have that
x < (1/2)(b + b) = (1/2)(2b) = ((1/2) · 2)b = 1b = b. So, x ∈ (a, b).


4. We can use reasoning similar to that used in 3 to see that all half-open intervals and closed 
intervals are not open sets. 


Theorem 6.3: Let a ∈ R. The infinite interval (a, ∞) is an open set.


The idea behind the proof is quite simple. If x ∈ (a, ∞), then (a, x + 1) is an open interval with x inside
of it and with (a, x + 1) ⊆ (a, ∞).


Proof of Theorem 6.3: Let x ∈ (a, ∞) and let b = x + 1.
Since x ∈ (a, ∞), x > a. Since (x + 1) - x = 1 > 0, we have b = x + 1 > x.


So, we have a < x < b. That is, x ∈ (a, b). Also, (a, b) ⊆ (a, ∞). Since x ∈ (a, ∞) was arbitrary, (a, ∞)
is an open set. □


In Problem 6 below (part (i)), you will be asked to show that an interval of the form (-∞, b) is also an
open set.


Theorem 6.4: Ø and R are both open sets. 


Proof: The statement that Ø is open is vacuously true (since Ø has no elements, there is nothing to
check).


If x ∈ R, then x ∈ (x - 1, x + 1) and (x - 1, x + 1) ⊆ R. Since x was an arbitrary element of R, we
have shown that for every x ∈ R, there is an open interval (a, b) with x ∈ (a, b) and (a, b) ⊆ R. So, R
is open. □


Many authors define “open” in a slightly different way from the definition we’ve been using. This next 
Theorem will show that the definition we have been using is equivalent to theirs. 


Theorem 6.5: A subset X of R is open if and only if for every real number x ∈ X, there is a positive real
number c such that (x - c, x + c) ⊆ X.


Analysis: The harder direction of the proof is showing that if X is open, then for every real number
x ∈ X, there is a positive real number c such that (x - c, x + c) ⊆ X.


To see this, suppose that X is open and let x ∈ X. Then there is an open interval (a, b) with x ∈ (a, b)
and (a, b) ⊆ X. We want to replace the interval (a, b) by an interval that has x right in the center.


The following picture should help us to come up with an argument.


[Figure: the interval (a, b) on a number line with x inside it; the distance from a to x is labeled x - a and the distance from x to b is labeled b - x]


In the picture, we have an open interval (a, b), containing x. In this particular picture, x is a bit closer 
to a than it is to b. However, we should remember to be careful that our argument doesn’t assume this 
(as we have no control over where x “sits” inside of (a, b)). 


In the picture, we see that x - a is the distance from a to x, and b - x is the distance from x to b.
Since the distance from a to x is smaller, let’s let c be that smaller distance. In other words, we let
c = x - a. From the picture, it looks like the interval (x - c, x + c) will be inside the interval (a, b).


In general, if x is closer to a, we would let c = x - a, and if x is closer to b, we would let c = b - x.
We can simply define c to be the smaller of x - a and b - x. That is, c = min{x - a, b - x}. From the
picture, it seems like with this choice of c, the interval (x - c, x + c) should give us what we want.


Proof of Theorem 6.5: Let X be an open subset of R and let x ∈ X. Then there is an open interval (a, b)
with x ∈ (a, b) and (a, b) ⊆ X. Let c = min{x - a, b - x}. We claim that (x - c, x + c) is an open
interval containing x and contained in (a, b). We need to show a ≤ x - c < x < x + c ≤ b.


Since c = min{x - a, b - x}, c ≤ x - a. So, -c ≥ -(x - a). It follows that

(x - c) - a ≥ (x - (x - a)) - a = (x - x + a) - a = a - a = 0.

So, x - c ≥ a.


Since c = min{x - a, b - x}, c ≤ b - x. So, -c ≥ -(b - x). It follows that

b - (x + c) = b - x - c ≥ b - x - (b - x) = 0.

So, b ≥ x + c, or equivalently, x + c ≤ b.


Note that x > a, so that x - a > 0, and x < b, so that b - x > 0. It follows that c > 0.


We have x - (x - c) = c > 0, so that x > x - c. We also have (x + c) - x = c > 0, so that
x + c > x.


We have shown a ≤ x - c < x < x + c ≤ b, as desired.


Since (x - c, x + c) ⊆ (a, b) and (a, b) ⊆ X, by the transitivity of ⊆ (Theorem 2.3 from Lesson 2), we
have (x - c, x + c) ⊆ X.


The converse is immediate since for x ∈ X, (x - c, x + c) is an open interval containing x. □
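

The choice c = min{x - a, b - x} from the proof can be tried out numerically. The following Python sketch uses exact rational arithmetic from the standard fractions module. The function name centered_radius and the sample values of a, b, and x are assumptions made for illustration.

```python
from fractions import Fraction


def centered_radius(a, b, x):
    """Return c = min{x - a, b - x} for a point x in the open interval (a, b)."""
    if not (a < x < b):
        raise ValueError("x must lie strictly between a and b")
    return min(x - a, b - x)


# Hypothetical sample: x = 1/4 inside the interval (0, 1).
a, b, x = Fraction(0), Fraction(1), Fraction(1, 4)
c = centered_radius(a, b, x)

# The chain of inequalities established in the proof of Theorem 6.5.
assert c > 0
assert a <= x - c < x < x + c <= b

print(c, x - c, x + c)   # 1/4 0 1/2
```

Here x is closer to a, so c = x - a and the left endpoint of (x - c, x + c) coincides with a.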


The basic definition of a topological space involves open sets, unions, and intersections. We’re not 
going to talk about general topological spaces in this lesson (we will look at them in Lesson 14), but in 
the spirit of the subject, we will prove some results about unions and intersections of open sets in R. 


Theorem 6.6: The union of two open sets in R is an open set in R.


Proof: Let A and B be open sets in R, and let x ∈ A ∪ B. Then x ∈ A or x ∈ B. Without loss of
generality, we may assume that x ∈ A (see the Note below). Since A is open in R, there is an interval
(a, b) with x ∈ (a, b) and (a, b) ⊆ A. By Theorem 2.4, A ⊆ A ∪ B. Since ⊆ is transitive (Theorem 2.3),
(a, b) ⊆ A ∪ B. Therefore, A ∪ B is open. □


Note: In the proof of Theorem 6.6, we used the expression “Without loss of generality.” This expression 
can be used when an argument can be split up into 2 or more cases, and the proof of each of the cases 
is nearly identical. 


For Theorem 6.6, the two cases are (i) x ∈ A and (ii) x ∈ B. The argument for case (ii) is the same as
the argument for case (i), essentially word for word; only the roles of A and B are interchanged.


Example 6.7: (-5, 2) is open by part 1 of Example 6.6 and (7, ∞) is open by Theorem 6.3. Therefore,
by Theorem 6.6, (-5, 2) ∪ (7, ∞) is also open.


If you look at the proof of Theorem 6.6 closely, you should notice that the proof would still work if we 
were taking a union of more than 2 sets. In fact, any union of open sets is open, as we now prove. 


Theorem 6.7: Let X be a set of open subsets of R. Then ⋃X is open.


Proof: Let X be a set of open subsets of R and let x ∈ ⋃X. Then x ∈ A for some A ∈ X. Since A is open
in R, there is an interval (a, b) with x ∈ (a, b) and (a, b) ⊆ A. By Problem 9 below (part (i)), we have
A ⊆ ⋃X. Since ⊆ is transitive (Theorem 2.3), (a, b) ⊆ ⋃X. Therefore, ⋃X is open. □


Example 6.8:
1. (1, 2) ∪ (2, 3) ∪ (3, 4) ∪ (4, ∞) is open.
2. R \ Z is open because it is a union of open intervals. It looks like this:
⋯ ∪ (-2, -1) ∪ (-1, 0) ∪ (0, 1) ∪ (1, 2) ∪ ⋯
R \ Z can also be written as
⋃{(n, n + 1) | n ∈ Z} or ⋃_{n ∈ Z} (n, n + 1).


3. If we take the union of all intervals of the form (1/(n + 1), 1/n) for positive integers n, we get an open
set. We can visualize this open set as follows:


⋃{(1/(n + 1), 1/n) | n ∈ Z⁺} = ⋯ ∪ (1/4, 1/3) ∪ (1/3, 1/2) ∪ (1/2, 1)


Theorem 6.8: Every open set in R can be expressed as a union of bounded open intervals.


The main idea of the argument will be the following. Every real number that is in an open set is inside
an open interval that is a subset of the set. Just take the union of all these open intervals (one interval
for each real number in the set).


Proof of Theorem 6.8: Let X be an open set in R. Since X is open, for each x ∈ X, there is an interval
(a_x, b_x) with x ∈ (a_x, b_x) and (a_x, b_x) ⊆ X. Let Y = {(a_x, b_x) | x ∈ X}. We will show that X = ⋃Y.


First, let x ∈ X. Then x ∈ (a_x, b_x). Since (a_x, b_x) ∈ Y, x ∈ ⋃Y. Since x was arbitrary, X ⊆ ⋃Y.


Now, let x ∈ ⋃Y. Then there is z ∈ X with x ∈ (a_z, b_z). Since (a_z, b_z) ⊆ X, x ∈ X. Since x ∈ ⋃Y was
arbitrary, ⋃Y ⊆ X.


Since X ⊆ ⋃Y and ⋃Y ⊆ X, it follows that X = ⋃Y. □


Theorem 6.9: The intersection of two open sets in R is an open set in R.


Proof: Let A and B be open sets in R and let x ∈ A ∩ B. Then x ∈ A and x ∈ B. Since A is open, there
is an open interval (a, b) with x ∈ (a, b) and (a, b) ⊆ A. Since B is open, there is an open interval (c, d)
with x ∈ (c, d) and (c, d) ⊆ B. Let C = (a, b) ∩ (c, d). Since x ∈ (a, b) and x ∈ (c, d), x ∈ C. By
Problem 6 below (part (ii)), C is an open interval. By Problem 11 from Lesson 2 and part (ii) of Problem
3 below, C ⊆ A and C ⊆ B. It follows that C ⊆ A ∩ B (Prove this!). Since x ∈ A ∩ B was arbitrary,
A ∩ B is open. □


In Problem 6 below (part (iii)), you will be asked to show that the intersection of finitely many open
sets in R is an open set in R. In Problem 8, you will be asked to show that an arbitrary intersection of
open sets does not need to be open.


A subset X of R is said to be closed if R \ X is open.


R \ X is called the complement of X in R, or simply the complement of X. It consists of all real numbers
not in X.


Example 6.9:


1. Every closed interval is a closed set. For example, [0, 1] is closed because its complement in R
is R \ [0, 1] = (-∞, 0) ∪ (1, ∞). This is a union of open intervals, which is open.


Similarly, [3, ∞) is a closed set because R \ [3, ∞) = (-∞, 3), which is open.


2. Half-open intervals are neither open nor closed. For example, we saw in Example 6.6 that (0, 1]
is not an open set. We see that (0, 1] is not closed by observing R \ (0, 1] = (-∞, 0] ∪ (1, ∞),
which is not open.


3. Ø is closed because R \ Ø = R is open. R is closed because R \ R = Ø is open. Ø and R are the
only two sets of real numbers that are both open and closed.


Theorem 6.10: The intersection of two closed sets in R is a closed set in R. 


Proof: Let A and B be closed sets in R. Then R \ A and R \ B are open sets in R. By Theorem 6.6 (or
6.7), (R \ A) ∪ (R \ B) is open in R. Therefore, R \ [(R \ A) ∪ (R \ B)] is closed in R. So, it suffices
to show that A ∩ B = R \ [(R \ A) ∪ (R \ B)]. Well, x ∈ A ∩ B if and only if x ∈ A and x ∈ B if and
only if x ∉ R \ A and x ∉ R \ B if and only if x ∉ (R \ A) ∪ (R \ B) if and only if
x ∈ R \ [(R \ A) ∪ (R \ B)]. So, A ∩ B = R \ [(R \ A) ∪ (R \ B)], completing the proof. □


A similar argument can be used to show that the union of two closed sets in R is a closed set in R. This
result can be extended to the union of finitely many closed sets in R with the help of Problem 6 below
(part (iii)). The dedicated reader should prove this. In Problem 10 below, you will be asked to show that
an arbitrary intersection of closed sets in R is closed. In Problem 8, you will be asked to show that an
arbitrary union of closed sets does not need to be closed.


Problem Set 6


Full solutions to these problems are available for free download here: 


www.SATPrepGet800.com/PMFBXSG 


LEVEL 1 


1. Draw Venn diagrams for (A \ B) \ C and A \ (B \ C). Are these two sets equal for all sets A, 
B, and C? If so, prove it. If not, provide a counterexample. 


2. Let A = {Ø, {Ø, {Ø}}}, B = {Ø, {Ø}}, C = (-∞, 2], D = (-1, 3]. Compute each of the following:


(i) A ∪ B
(ii) A ∩ B
(iii) A \ B
(iv) B \ A
(v) A Δ B
(vi) C ∪ D
(vii) C ∩ D
(viii) C \ D
(ix) D \ C
(x) C Δ D


LEVEL 2 


3. Prove the following: 
(i) The operation of forming unions is commutative. 
(ii) The operation of forming intersections is commutative. 
(iii) The operation of forming intersections is associative. 


4. Prove that if an interval I is unbounded, then I has one of the following five forms: (a, ∞),
(-∞, b), [a, ∞), (-∞, b], (-∞, ∞).


LEVEL 3 


5. Prove or provide a counterexample:
(i) Every pairwise disjoint set of sets is disjoint.
(ii) Every disjoint set of sets is pairwise disjoint.


6. Prove the following:
(i) For all b ∈ R, the infinite interval (-∞, b) is an open set in R.
(ii) The intersection of two open intervals in R is either empty or an open interval in R.
(iii) The intersection of finitely many open sets in R is an open set in R.


7. Let A, B, and C be sets. Prove each of the following:
(i) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).
(ii) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).
(iii) C \ (A ∪ B) = (C \ A) ∩ (C \ B).
(iv) C \ (A ∩ B) = (C \ A) ∪ (C \ B).


LEVEL 4 


8. Give an example of an infinite collection of open sets whose intersection is not open. Also, give 
an example of an infinite collection of closed sets whose union is not closed. Provide a proof for 
each example. 

9. Let X be a nonempty set of sets. Prove the following:

(i) For all A ∈ X, A ⊆ ⋃X.
(ii) For all A ∈ X, ⋂X ⊆ A.


LEVEL 5 


10. Prove that if X is a nonempty set of closed subsets of R, then ⋂X is closed.


11. Let A be a set and let X be a nonempty collection of sets. Prove each of the following:
(i) A ∩ ⋃X = ⋃{A ∩ B | B ∈ X}.
(ii) A ∪ ⋂X = ⋂{A ∪ B | B ∈ X}.
(iii) A \ ⋃X = ⋂{A \ B | B ∈ X}.
(iv) A \ ⋂X = ⋃{A \ B | B ∈ X}.


12. Prove that every closed set in R can be written as an intersection ⋂X, where each element of X
is a union of at most 2 closed intervals.


CHALLENGE PROBLEM 


13. Prove that every nonempty open set of real numbers can be expressed as a union of pairwise
disjoint open intervals.


LESSON 7 - COMPLEX ANALYSIS
THE FIELD OF COMPLEX NUMBERS


A Limitation of the Reals 


In Lesson 5 we asked (and answered) the question “Why isn’t Q (the field of rational numbers) 
enough?” We now ask the same question about R, the field of real numbers. 


A linear equation has the form ax + b = 0, where a ≠ 0. If we are working inside a field, then this
equation has the unique solution x = -ba⁻¹ = -b/a. For example, the equation 2x - 1 = 0 has the
unique solution x = 2⁻¹ = 1/2. Notice how important it is that we are working inside a field here. If we
were allowed to use only the properties of a commutative ring, then we might not be able to solve this
equation. For example, in Z (the ring of integers), the equation 2x - 1 = 0 has no solution.
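

As a small illustration, the following Python sketch solves ax + b = 0 in Q using the Fraction type from the standard library. The function name solve_linear is hypothetical. The sketch also shows that the solution of 2x - 1 = 0 is not an integer, which matches the remark about Z.

```python
from fractions import Fraction


def solve_linear(a, b):
    """Solve ax + b = 0 in the field of rational numbers, assuming a ≠ 0."""
    a, b = Fraction(a), Fraction(b)
    if a == 0:
        raise ValueError("a must be nonzero")
    # x = -b * a^(-1), using the multiplicative inverse that a field provides.
    return -b / a


x = solve_linear(2, -1)
print(x)   # 1/2

# The solution has denominator 2, so it is not an element of Z.
is_integer_solution = (x.denominator == 1)
print(is_integer_solution)
```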


A quadratic equation has the form ax² + bx + c = 0, where a ≠ 0. Is working inside a field enough
to solve this equation? The answer is no! For example, a solution to the equation x² - 2 = 0 must
satisfy x² = 2. In Lesson 5, we proved that this equation cannot be solved in Q. This was one of our
main motivations for introducing R. And, in fact, the equation x² - 2 = 0 can be solved in R. However,
the equation x² + 1 = 0 cannot be solved in R. This follows immediately from Theorem 5.2, which
says that if x is an element of an ordered field, then x² = x · x can never be negative.


Is there a field containing R, where all quadratic equations can be solved? The answer is yes, and in
fact, we can do much better than that. In this lesson we will define a field containing the field of real
numbers such that every equation of the form aₙxⁿ + aₙ₋₁xⁿ⁻¹ + ⋯ + a₁x + a₀ = 0 has a solution.
Such an equation is called a polynomial equation, and a field in which every such polynomial equation
has a solution is called an algebraically closed field.


The Complex Field 


The standard form of a complex number is a + bi, where a and b are real numbers. So, the set of
complex numbers is C = {a + bi | a, b ∈ R}.


If we identify 1 = 1 + 0i with the ordered pair (1, 0), and we identify i = 0 + 1i with the ordered pair
(0, 1), then it is natural to write the complex number a + bi as the point (a, b). Here is a reasonable
justification for this:


a + bi = a(1, 0) + b(0, 1) = (a, 0) + (0, b) = (a, b)


In this way, we can visualize a complex number as a point in The Complex Plane. A portion of the Complex
Plane is shown in the figure below, with several complex numbers displayed as points of the form (x, y).


[Figure: a portion of the complex plane with axes labeled x and y, showing the points (-3, 2), (1, 2), (0, 1), (2, 0), (-1, -2), and (1, -3)]


The complex plane is formed by taking two copies of the real line and placing one horizontally and the
other vertically. The horizontal copy of the real line is called the x-axis or the real axis (labeled x in the
above figure) and the vertical copy of the real line is called the y-axis or imaginary axis (labeled y in
the above figure). The two axes intersect at the point (0, 0). This point is called the origin.


We can also visualize the complex number a + bi as a directed line 
segment (or vector) starting at the origin and ending at the point 
(a, b). Three examples are shown to the right. 


If z = a + bi is a complex number, we call a the real part of z and b
the imaginary part of z, and we write a = Re z and b = Im z.


Two complex numbers are equal if and only if they have the same real
part and the same imaginary part. In other words,


a + bi = c + di if and only if a = c and b = d.


We add two complex numbers by simply adding their real parts and adding their imaginary parts. So,
(a + bi) + (c + di) = (a + c) + (b + d)i.
As a point, this sum is (a + c, b + d). We can visualize this sum as the vector starting at the origin that
is the diagonal of the parallelogram formed from the vectors a + bi and c + di. Here is an example
showing that (1 + 2i) + (-3 + i) = -2 + 3i.


The definition for multiplying two complex numbers is a bit more complicated:
(a + bi)(c + di) = (ac - bd) + (ad + bc)i.


Notes: (1) If b = 0, then we call a + bi = a + 0i = a a real number. Note that when we add or
multiply two real numbers, we always get another real number.


(a + 0i) + (b + 0i) = (a + b) + (0 + 0)i = (a + b) + 0i = a + b.
(a + 0i)(b + 0i) = (ab - 0 · 0) + (a · 0 + 0b)i = (ab - 0) + (0 + 0)i = ab + 0i = ab.


(2) If a = 0, then we call a + bi = 0 + bi = bi a pure imaginary number.


(3) i² = -1. To see this, note that i² = i · i = (0 + 1i)(0 + 1i), and we have
(0 + 1i)(0 + 1i) = (0 · 0 - 1 · 1) + (0 · 1 + 1 · 0)i = (0 - 1) + (0 + 0)i = -1 + 0i = -1.


(4) The definition of the product of two complex numbers is motivated by how multiplication should
behave in a field, together with replacing i² by -1. If we were to naively multiply the two complex
numbers, we would have


(a + bi)(c + di) = (a + bi)c + (a + bi)(di) = ac + bci + adi + bdi²
= ac + bci + adi + bd(-1) = ac + (bc + ad)i - bd = (ac - bd) + (ad + bc)i.


The dedicated reader should make a note of which field properties were used during this computation.
Those familiar with the mnemonic FOIL may notice that “FOILing” will always work to produce the
product of two complex numbers, provided we replace i² by -1 and simplify.


Example 7.1: Let z = 2 - 3i and w = -1 + 5i. Then
z + w = (2 - 3i) + (-1 + 5i) = (2 + (-1)) + (-3 + 5)i = 1 + 2i.
zw = (2 - 3i)(-1 + 5i) = (2(-1) - (-3)(5)) + (2 · 5 + (-3)(-1))i
= (-2 + 15) + (10 + 3)i = 13 + 13i.
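

The definitions of addition and multiplication can be sketched in Python by representing a + bi as the ordered pair (a, b), as in the discussion of the complex plane. The function names add and mul are hypothetical. The last lines cross-check the result against Python's built-in complex type.

```python
def add(z, w):
    """(a + bi) + (c + di) = (a + c) + (b + d)i, with a + bi stored as (a, b)."""
    (a, b), (c, d) = z, w
    return (a + c, b + d)


def mul(z, w):
    """(a + bi)(c + di) = (ac - bd) + (ad + bc)i."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)


z = (2, -3)    # 2 - 3i
w = (-1, 5)    # -1 + 5i
i = (0, 1)     # i = 0 + 1i

print(add(z, w))   # (1, 2), that is, 1 + 2i
print(mul(z, w))   # (13, 13), that is, 13 + 13i
print(mul(i, i))   # (-1, 0), that is, i² = -1

# Cross-check against the built-in complex type.
assert complex(*mul(z, w)) == complex(*z) * complex(*w)
```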


With the definitions we just made for addition and multiplication, we get (C, +, ·), the field of complex
numbers. See Lesson 5 if you need to review the definition of a field.


Theorem 7.1: (C, +, ·) is a field.


The proof that (C, +, ·) is a field is very straightforward and mostly uses the fact that (R, +, ·) is a
field. For example, to verify that addition is commutative in C, we have

(a + bi) + (c + di) = (a + c) + (b + d)i = (c + a) + (d + b)i = (c + di) + (a + bi).


We have a + c = c + a because a, c ∈ R and addition is commutative in R. For the same reason, we
have b + d = d + b.


We leave the full verification that (C, +, ·) is a field as an exercise for the reader (Problem 2 below),
and simply note a few things of importance here:

• The identity for addition is 0 = 0 + 0i.

• The identity for multiplication is 1 = 1 + 0i.

• The additive inverse of z = a + bi is -z = -(a + bi) = -a - bi.

• The multiplicative inverse of z = a + bi ≠ 0 is z⁻¹ = a/(a² + b²) - (b/(a² + b²))i.


The reader is expected to verify all this in Problem 2.


Remark: By Note 1 above, we see that (R, +, ·) is a subfield of (C, +, ·). That is, R ⊆ C and (R, +, ·)
is a field with respect to the field operations of (C, +, ·). In other words, we don’t need to “change”
the definition of addition or multiplication to get the appropriate operations in R; the operations are
already behaving correctly. Subfields will be covered in more detail in Lesson 11.


Subtraction: If z, w ∈ C, with z = a + bi and w = c + di, then we define the difference z - w by
z - w = z + (-w) = (a + bi) + (-c - di) = (a - c) + (b - d)i.


As a point, this difference is (a - c, b - d). Here is an example illustrating how subtraction works using
the computation (1 + 2i) - (2 - i) = -1 + 3i.


Observe how we first replaced 2 - i by -2 + i so that we could change the subtraction problem to the
addition problem: (1 + 2i) + (-2 + i). We then formed a parallelogram using 1 + 2i and -2 + i as
edges, and finally, drew the diagonal of that parallelogram to see the result.


Division: If z ∈ C and w ∈ C* with z = a + bi and w = c + di, then we define the quotient z/w by


z/w = zw⁻¹ = (a + bi)(c/(c² + d²) - (d/(c² + d²))i) = (ac + bd)/(c² + d²) + ((bc - ad)/(c² + d²))i.


The definition of division in a field unfortunately led to a messy looking formula. However, when
actually performing division, there is an easier way to think about it, as we will see below.


The conjugate of the complex number z = a + bi is the complex number z̄ = a - bi.


Notes: (1) To take the conjugate of a complex number, we simply negate the imaginary part of the
number and leave the real part as it is.


(2) If z = a + bi ≠ 0, then at least one of a or b is not zero. It follows that z̄ = a - bi is also not 0.


(3) The product of a complex number with its conjugate is always a nonnegative real number.
Specifically, if z = a + bi, then zz̄ = (a + bi)(a - bi) = (a² + b²) + (-ab + ab)i = a² + b².


(4) We can change the quotient z/w to standard form by multiplying the numerator and denominator by
w̄. So, if z = a + bi and w = c + di, then we have


z/w = (zw̄)/(ww̄) = ((a + bi)(c - di))/((c + di)(c - di)) = ((ac + bd) + (bc - ad)i)/(c² + d²) = (ac + bd)/(c² + d²) + ((bc - ad)/(c² + d²))i.


Example 7.2: Let z = 2 - 3i and w = -1 + 5i. Then


z̄ = 2 + 3i. w̄ = -1 - 5i.

z/w = (zw̄)/(ww̄) = ((2 - 3i)(-1 - 5i))/((-1 + 5i)(-1 - 5i)) = ((-2 - 15) + (-10 + 3)i)/((-1)² + 5²) = (-17 - 7i)/26 = -17/26 - (7/26)i.

Recall from Lesson 5 that in an ordered field, if a > 0 and b > 0, then a + b > 0 (Order Property 1)
and ab > 0 (Order Property 2). Also, for every element a, exactly one of the following holds: a > 0,
a = 0, or a < 0 (Order Property 3).


Theorem 7.2: The field of complex numbers cannot be ordered.


Proof: Suppose toward contradiction that < is an ordering of (C, +, ·). Since i² = -1 ≠ 0, we have
i ≠ 0. So, by Order Property 3, either i > 0 or i < 0.


If i > 0, then -1 = i² = i · i > 0 by Order Property 2.


If i < 0, then -i > 0, and therefore, -1 = i² = (-1)(-1)i · i = (-1i)(-1i) = (-i)(-i) > 0, again by
Order Property 2.


So, -1 > 0 and it follows that 1 = (-1)(-1) > 0, again by Order Property 2. Therefore, we have
-1 > 0 and 1 > 0, violating Order Property 3. So, (C, +, ·) cannot be ordered. □


Absolute Value and Distance


If x and y are real or complex numbers such that y = x”, then we call x a square root of y. If x isa 
positive real number, then we say that x is the positive square root of y and we write x = Jy. 


For positive real numbers, we will use the square root symbol only for the positive square root of the 
number. For complex numbers, we will use the square root symbol for the principal square root of the 
number. The concept of principal square root will be explained in Lesson 15. 


Example 7.3:
1. Since $2^2 = 4$, $2 \in \mathbb{R}$, and $2 > 0$, we see that 2 is the positive square root of 4 and we write
$2 = \sqrt{4}$.


2. We have $(-2)^2 = 4$, but $-2 < 0$, and so we do not write $-2 = \sqrt{4}$. However, $-2$ is still a square
root of 4, and we can write $-2 = -\sqrt{4}$.


3. Since $i^2 = -1$, we see that $i$ is a square root of $-1$.
4. Since $(-i)^2 = (-i)(-i) = (-1)(-1)i^2 = 1(-1) = -1$, we see that $-i$ is also a square root of
$-1$.


5. $(1 + i)^2 = (1 + i)(1 + i) = (1 - 1) + (1 + 1)i = 0 + 2i = 2i$. So, $1 + i$ is a square root of $2i$.
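The square roots listed in Example 7.3 can be confirmed by squaring each candidate. This is a small sketch using Python's built-in complex numbers (where `1j` plays the role of $i$); the names `claimed_roots` and `is_square_root` are illustrative only.

```python
# Each key y is paired with the numbers that Example 7.3 identifies as square roots of y.
claimed_roots = {4: [2, -2], -1: [1j, -1j], 2j: [1 + 1j]}

def is_square_root(x, y):
    """True when x is a square root of y, that is, when x * x equals y."""
    return x * x == y

results = {y: [is_square_root(x, y) for x in xs] for y, xs in claimed_roots.items()}
print(results)
```

The inputs here have small integer parts, so the floating-point products are exact; for general complex numbers a tolerance-based comparison would be the safer choice.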


The absolute value or modulus of the complex number $z = a + bi$ is the nonnegative real number

$$|z| = \sqrt{a^2 + b^2} = \sqrt{(\operatorname{Re} z)^2 + (\operatorname{Im} z)^2}.$$

Note: If $z = a + 0i = a$ is a real number, then $|a| = \sqrt{a^2}$. This is equal to $a$ if $a \geq 0$ and $-a$ if $a < 0$.


For example, $|4| = \sqrt{4^2} = \sqrt{16} = 4$ and $|-4| = \sqrt{(-4)^2} = \sqrt{16} = 4 = -(-4)$.


The statement “|a| = -a for a < 0” often confuses students. This confusion is understandable, as a 
minus sign is usually used to indicate that an expression is negative, whereas here we are negating a 
negative number to make it positive. Unfortunately, this is the simplest way to say, “delete the minus 
sign in front of the number” using basic notation. 


Geometrically, the absolute value of a complex number z is the distance between the point z and the 
origin. 


Example 7.4: Which of the following complex numbers is closest to the origin? $1 + 2i$, $-3 + i$, or
$-2 + 3i$?

$$|1 + 2i| = \sqrt{1^2 + 2^2} = \sqrt{1 + 4} = \sqrt{5}$$

$$|-3 + i| = \sqrt{(-3)^2 + 1^2} = \sqrt{9 + 1} = \sqrt{10}$$

$$|-2 + 3i| = \sqrt{(-2)^2 + 3^2} = \sqrt{4 + 9} = \sqrt{13}$$

Since $\sqrt{5} < \sqrt{10} < \sqrt{13}$, we see that $1 + 2i$ is closest to the origin.
Notes: (1) Here we have used the following theorem: If $a, b \in \mathbb{R}^+$, then $a < b$ if and only if $a^2 < b^2$.
To see this, observe that $a^2 < b^2$ if and only if $b^2 - a^2 > 0$ if and only if $(b + a)(b - a) > 0$. Since
$a > 0$ and $b > 0$, by Order Property 1, $a + b > 0$. It follows that $a^2 < b^2$ if and only if $b - a > 0$ if and
only if $b > a$ if and only if $a < b$.


Applying this theorem to $5 < 10 < 13$, we get $\sqrt{5} < \sqrt{10} < \sqrt{13}$.

(2) The definition of the absolute value of a complex number is motivated by the Pythagorean Theorem.


As an example, look at $-3 + i$ in the figure below. Observe that to get from the origin to the point
$(-3, 1)$, we move to the left 3 units and then up 1 unit. This gives us a right triangle with legs of lengths
3 and 1. By the Pythagorean Theorem, the hypotenuse has length $\sqrt{3^2 + 1^2} = \sqrt{9 + 1} = \sqrt{10}$.
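The comparison in Example 7.4 can be sketched in code. The version below is one possible approach, assuming Python's built-in complex type; `modulus_squared` is a hypothetical helper. It relies on the theorem from Note (1): for positive reals, comparing squares gives the same ordering as comparing the numbers themselves, so no square roots are needed to find the closest point.

```python
import math

def modulus_squared(z):
    """Return |z|^2 = (Re z)^2 + (Im z)^2 for a Python complex number z."""
    return z.real ** 2 + z.imag ** 2

points = [1 + 2j, -3 + 1j, -2 + 3j]

# Comparing squared moduli orders the points exactly as comparing moduli would.
closest = min(points, key=modulus_squared)
print(closest, math.sqrt(modulus_squared(closest)))
```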
The distance between the complex numbers $z = a + bi$ and $w = c + di$ is

$$d(z, w) = |z - w| = \sqrt{(c - a)^2 + (d - b)^2}.$$


Geometrically, we can translate the vector z — w so that the directed line segment begins at the 
terminal point of w and ends at the terminal point of z. Let’s look one more time at the figure we drew 
for (1 + 2i) — (2 — i) = - 1 + 3i and then translate the solution vector as we just suggested. 


Notice that the expression for the distance between two complex numbers follows from a simple 
application of the Pythagorean Theorem. Let’s continue to use the same example to help us see this. 


In the figure above, we can get the lengths of the legs of the triangle either by simply counting the 
units, or by subtracting the appropriate coordinates. For example, the length of the horizontal leg is 
$2 - 1 = 1$ and the length of the vertical leg is $2 - (-1) = 2 + 1 = 3$. We can then use the Pythagorean
Theorem to get the length of the hypotenuse of the triangle: $c = \sqrt{1^2 + 3^2} = \sqrt{1 + 9} = \sqrt{10}$.
Compare this geometric procedure to the formula for distance given above. 
While we’re on the subject of triangles, the next theorem involving arbitrary triangles is very useful. 


Theorem 7.3 (The Triangle Inequality): For all $z, w \in \mathbb{C}$, $|z + w| \leq |z| + |w|$.

Geometrically, the Triangle Inequality says that the length of the third side of a triangle is less than or
equal to the sum of the lengths of the other two sides of the triangle. We leave the proof as an exercise
(see Problem 4 below).


As an example, let’s look at the sum $(1 + 2i) + (-3 + i) = -2 + 3i$. In Example 7.4, we computed
$|1 + 2i| = \sqrt{5}$, $|-3 + i| = \sqrt{10}$, and $|-2 + 3i| = \sqrt{13}$.
Note that $\sqrt{5} + \sqrt{10} > \sqrt{4} + \sqrt{9} = 2 + 3 = 5$, whereas $\sqrt{13} < \sqrt{16} = 4$. So, we see that

$$|(1 + 2i) + (-3 + i)| = |-2 + 3i| = \sqrt{13} < 4 < 5 < \sqrt{5} + \sqrt{10} = |1 + 2i| + |-3 + i|.$$
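The distance formula and this instance of the Triangle Inequality can be verified numerically. The snippet below is a sketch using Python's built-in `abs` on complex numbers; the helper name `distance` is an assumption made for illustration.

```python
import math

def distance(z, w):
    """Distance between the complex numbers z and w, computed as |z - w|."""
    return abs(z - w)

z, w = 1 + 2j, -3 + 1j
lhs = abs(z + w)        # |-2 + 3i|
rhs = abs(z) + abs(w)   # |1 + 2i| + |-3 + i|
print(lhs <= rhs)

# The pair used in the distance discussion: 1 + 2i and 2 - i.
print(distance(1 + 2j, 2 - 1j))
```

A numerical check like this confirms a single instance only; the general statement still requires the proof requested in Problem 4.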


In the following picture, there are two triangles. We’ve put dark bold lines around the leftmost triangle 
and labeled the sides with their lengths. 


Basic Topology of C 


A circle in the Complex Plane is the set of all points that are at a fixed distance from a fixed point. The 
fixed distance is called the radius of the circle and the fixed point is called the center of the circle. 


If a circle has radius $r > 0$ and center $c = a + bi$, then any point $z = x + yi$ on the circle must satisfy
$|z - c| = r$, or equivalently, $(x - a)^2 + (y - b)^2 = r^2$.


Note: The equation |z — c| = r says “The distance between z and c is equal to r.” In other words, the 
distance between any point on the circle and the center of the circle is equal to the radius of the circle. 


Example 7.5: The circle with equation $|z + 2 - i| = 2$ has
center $c = -(2 - i) = -2 + i$ and radius $r = 2$.


Note: |z + 2—i| = |z — (-2 + i)|. So, if we rewrite the 
equation as |z — (-2 + i)| = 2, it is easy to pick out the 
center and radius of the circle. 


A picture of the circle is shown to the right. The center is 
labeled and a typical radius is drawn. 
An open disk in $\mathbb{C}$ consists of all the points in the interior of a circle. If $a$ is the center of the open disk
and $r$ is the radius of the open disk, then any point $z$ inside the disk satisfies $|z - a| < r$.


$N_r(a) = \{z \in \mathbb{C} \mid |z - a| < r\}$ is also called the
$r$-neighborhood of $a$.


Example 7.6: $N_2(-2 + i) = \{z \in \mathbb{C} \mid |z + 2 - i| < 2\}$ is the
2-neighborhood of $-2 + i$. It consists of all points inside the
circle $|z + 2 - i| = 2$.


Notes: (1) A picture of the 2-neighborhood of -2 +i is 
shown to the right. The center is labeled and a typical radius 
is drawn. We drew the boundary of the disk with dashes to 
indicate that points on the circle are not in the 
neighborhood and we shaded the interior of the disk to 
indicate that every point inside the circle is in the 
neighborhood. 


(2) The definitions of open disk and $r$-neighborhood of $a$ also make sense in $\mathbb{R}$, but the geometry looks
a bit different. An open disk in $\mathbb{R}$ is simply an open interval. If $x$ and $a$ are real numbers, then we have

$$x \in N_r(a) \Leftrightarrow |x - a| < r \Leftrightarrow \sqrt{(x - a)^2} < r \Leftrightarrow 0 \leq (x - a)^2 < r^2$$
$$\Leftrightarrow -r < x - a < r \Leftrightarrow a - r < x < a + r \Leftrightarrow x \in (a - r, a + r).$$

So, in $\mathbb{R}$, an $r$-neighborhood of $a$ is the open interval $N_r(a) = (a - r, a + r)$. Notice that the length
(or diameter) of this interval is $2r$.


As an example, let’s draw a picture of $N_2(1) = (1 - 2, 1 + 2) = (-1, 3)$. Observe that the center of
this open disk (or open interval or neighborhood) in $\mathbb{R}$ is the real number 1, the radius of the open disk
is 2, and the diameter of the open disk (or length of the interval) is 4.


[Figure: number line showing the open interval $(-1, 3)$ centered at 1]


A closed disk is the interior of a circle together with the circle itself (the boundary is included). If a is 
the center of the closed disk and r is the radius of the closed disk, then any point z inside the closed 
disk satisfies $|z - a| \leq r$.


Notes: (1) In this case, the circle itself would be drawn solid to indicate that all points on the circle are 
included. 


(2) Just like an open disk in R is an open interval, a closed disk in R is a closed interval. 


(3) The reader is encouraged to draw a few open and closed disks in both C and R, and to write down 
the corresponding sets of points using set-builder notation and, in the case of R, interval notation. 
A punctured open disk consists of all the points in the interior of a circle except for the center of the
circle. If $a$ is the center of the punctured open disk and $r$ is the radius of the open disk, then any point
$z$ inside the punctured disk satisfies $|z - a| < r$ and $z \neq a$.


Note that $z \neq a$ is equivalent to $z - a \neq 0$. In turn, this is equivalent to $|z - a| \neq 0$. Since $|z - a|$ must
be nonnegative, $|z - a| \neq 0$ is equivalent to $|z - a| > 0$ or $0 < |z - a|$.
Therefore, a punctured open disk with center $a$ and radius $r$ consists of all points $z$ that satisfy

$$0 < |z - a| < r.$$

$N_r^{\odot}(a) = \{z \mid 0 < |z - a| < r\}$ is also called a deleted $r$-neighborhood of $a$.


Example 7.7: $N_2^{\odot}(-2 + i) = \{z \in \mathbb{C} \mid 0 < |z + 2 - i| < 2\}$
is the deleted 2-neighborhood of $-2 + i$. It consists of all
points inside the circle $|z + 2 - i| = 2$, except for $-2 + i$.


Notes: (1) A picture of the deleted 2-neighborhood of 
-2 +i is shown to the right. Notice that this time we 
excluded the center of the disk - 2 + i, as this point is not 
included in the set. 


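(2) In $\mathbb{R}$, we have

$$N_r^{\odot}(a) = (a - r, a + r) \setminus \{a\} = (a - r, a) \cup (a, a + r).$$

This is the open interval centered at $a$ of length (or diameter) $2r$ with $a$ removed.


Let’s draw a picture of $N_2^{\odot}(1) = (-1, 3) \setminus \{1\} = (-1, 1) \cup (1, 3)$.


[Figure: number line showing the interval $(-1, 3)$ with the point 1 removed]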


(3) Notice how all the topological definitions we are presenting make sense in both R and C, but the 
geometry in each case looks different. You will continue to see this happen. In fact, these definitions 
make sense for many, many sets and structures, all with their own “look.” In general, topology allows 
us to make definitions and prove theorems that can be applied very broadly and used in many (if not 
all) branches of mathematics. 
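Open disks, closed disks, and deleted neighborhoods translate directly into membership tests. The sketch below is one way to write them, assuming Python's built-in numeric types; the function names are hypothetical. Because `abs` works for both real and complex numbers, the same code covers neighborhoods in both the real line and the complex plane.

```python
def in_open_disk(z, a, r):
    """True when z lies in the r-neighborhood of a, that is, |z - a| < r."""
    return abs(z - a) < r

def in_closed_disk(z, a, r):
    """True when |z - a| <= r, so points on the bounding circle are included."""
    return abs(z - a) <= r

def in_deleted_neighborhood(z, a, r):
    """True when 0 < |z - a| < r, so the center a itself is excluded."""
    return 0 < abs(z - a) < r

center = -2 + 1j
on_circle = 0 + 1j  # at distance exactly 2 from the center
print(in_open_disk(on_circle, center, 2), in_closed_disk(on_circle, center, 2))
print(in_deleted_neighborhood(center, center, 2))

# The same functions applied to the real interval N_2(1) = (-1, 3).
print(in_open_disk(0, 1, 2), in_open_disk(3, 1, 2))
```

Floating-point rounding can misclassify points extremely close to the bounding circle, so this is an illustration of the definitions and not a substitute for exact reasoning.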


A subset $X$ of $\mathbb{C}$ is said to be open if for every complex number $z \in X$, there is an open disk $D$ with
$z \in D$ and $D \subseteq X$.


In words, a set is open in C if every point in the set has “space” all around it inside the set. If you think 
of each point in the set as an animal, then each animal in the set should be able to move a little in any 
direction it chooses without leaving the set. Another way to think of this is that no number is right on 
“the edge” or “the boundary” of the set, about to fall out of it. 
Example 7.8:


1. Every open disk $D$ is an open set. To see this, simply observe that if $z \in D$, then $D$ itself is an
open disk with $z \in D$ and $D \subseteq D$.


2. A closed disk is not an open set because it contains its “boundary.” As an example, let’s look at
the closed unit disk $D = \{z \in \mathbb{C} \mid |z| \leq 1\}$. Let’s focus on the point $i$. First note that $i \in D$
because $|i| = \sqrt{0^2 + 1^2} = \sqrt{1} = 1$ and $1 \leq 1$. Now, any open disk $N$ containing $i$ will contain
points above $i$. Let’s say $(1 + \epsilon)i \in N$ for some positive real number $\epsilon$. Now, we have
$|(1 + \epsilon)i| = \sqrt{0^2 + (1 + \epsilon)^2} = 1 + \epsilon$, which is greater than 1. Therefore, $(1 + \epsilon)i \notin D$. It
follows that $N \nsubseteq D$, and so, $D$ is not open.


3. We can use reasoning similar to that used in 2 to see that if we take any subset of a disk that
contains any points on the bounding circle, then that set will not be open.


4. $\emptyset$ and $\mathbb{C}$ are both open. You will be asked to prove this in Problem 7 below (parts (i) and (ii)).


As we mentioned in Lesson 6 right before Theorem 6.5, many authors define “open” in a slightly 
different way from the definition we’ve been using. Once again, let’s show that the definition we have 
been using is equivalent to theirs. 


Theorem 7.4: A subset $X$ of $\mathbb{C}$ is open if and only if for every complex number $w \in X$, there is a positive
real number $d$ such that $N_d(w) \subseteq X$.


Analysis: The harder direction of the proof is showing that if $X$ is open, then for every complex number
$w \in X$, there is a positive real number $d$ such that $N_d(w) \subseteq X$.


To see this, suppose that $X$ is open and let $w \in X$. Then there is an open disk
$D = \{z \in \mathbb{C} \mid |z - a| < r\}$ with $w \in D$ and $D \subseteq X$. We want to replace the disk $D$ with a disk
that has $w$ right in the center.


To accomplish this, we let $c$ be the distance from $w$ to $a$. Then $r - c$ is the distance from $w$ to the
boundary of $D$. We will show that the disk with center $w$ and radius $r - c$ is a subset of $D$.


The picture to the right illustrates this idea. Notice that
$c + (r - c) = r$, the radius of disk $D$.


Proof of Theorem 7.4: Let $X$ be an open subset of $\mathbb{C}$ and let $w \in X$. Then there is an open disk $D$ with
$w \in D$ and $D \subseteq X$.


Suppose that $D$ has center $a$ and radius $r$. So, $D = \{z \in \mathbb{C} \mid |z - a| < r\}$.
Let $c = |w - a|$ and let $d = r - c$. Since $w \in D$, we have $c < r$, and so $d > 0$. We will show that $N_d(w) \subseteq D$.


Let $z \in N_d(w)$. Then $|z - w| < d = r - c$.

By the Triangle Inequality (and SACT; see Note 2 below),

$$|z - a| = |(z - w) + (w - a)| \leq |z - w| + |w - a| < (r - c) + c = r.$$

So, $z \in D$. Since $z$ was an arbitrary element of $N_d(w)$, we showed that $N_d(w) \subseteq D$.


So, we have $N_d(w) \subseteq D$ and $D \subseteq X$. By the transitivity of $\subseteq$ (Theorem 2.3 from Lesson 2), we have
$N_d(w) \subseteq X$.


The converse is immediate since for $w \in X$, $N_d(w)$ is an open disk containing $w$. □


Notes: (1) The picture to the right shows how we used the Triangle
Inequality. The three sides of the triangle have lengths $|z - w|$,
$|w - a|$, and $|z - a|$.


(2) Notice how we used SACT (the Standard Advanced Calculus Trick)
here. Starting with $z - a$, we wanted to make $z - w$ and $w - a$
“appear.” We were able to do this simply by subtracting and then
adding $w$ between $z$ and $a$. We often use this trick when applying the
Triangle Inequality. SACT was introduced in Lesson 4 (Note 7 following
Example 4.5).


(3) The same proof used here can be used to prove Theorem 6.5. The geometry looks different (disks 
and neighborhoods are open intervals instead of the interiors of circles, and points appear on the real 
line instead of in the complex plane), but the argument is identical. Compare this proof to the proof we 
used in Theorem 6.5. 
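The construction in the proof of Theorem 7.4 can be illustrated numerically: given a point $w$ in an open disk with center $a$ and radius $r$, the radius $d = r - |w - a|$ yields a disk centered at $w$ that stays inside the original disk. The sketch below samples random points to test this; the specific disk, the point `w`, and the helper name `centered_radius` are assumptions chosen for illustration, and sampling gives evidence only, not a proof.

```python
import cmath
import random

def centered_radius(w, a, r):
    """Return d = r - |w - a|, the radius used in the proof of Theorem 7.4."""
    c = abs(w - a)
    if c >= r:
        raise ValueError("w is not in the open disk with center a and radius r")
    return r - c

a, r = -2 + 1j, 2.0
w = -1 + 2j                    # |w - a| = sqrt(2) < 2, so w lies in the disk
d = centered_radius(w, a, r)

random.seed(0)
for _ in range(1000):
    # Pick z with |z - w| < d (a small margin guards against rounding).
    radius = 0.999 * d * random.random()
    angle = random.uniform(0, 2 * cmath.pi)
    z = w + cmath.rect(radius, angle)
    assert abs(z - a) < r      # z is in the original disk, as the proof predicts
```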


A subset X of C is said to be closed if C \ X is open. 


C \ X is called the complement of X in C, or simply the complement of X. It consists of all complex 
numbers not in X. 


Example 7.9:


1. Every closed disk is a closed set. For example, $D = \{z \in \mathbb{C} \mid |z| \leq 1\}$ is closed because its
complement in $\mathbb{C}$ is $\mathbb{C} \setminus D = \{z \in \mathbb{C} \mid |z| > 1\}$. You will be asked to prove that this complement
is open in Problem 7 below (part (iii)).


2. If we take any subset of a closed disk that includes the interior of the disk, but is missing at least
one point on the bounding circle, then that set will not be closed. You will be asked to prove
this for the closed unit disk $\{z \in \mathbb{C} \mid |z| \leq 1\}$ in Problem 10 below.


3. $\emptyset$ is closed because $\mathbb{C} \setminus \emptyset = \mathbb{C}$ is open. $\mathbb{C}$ is closed because $\mathbb{C} \setminus \mathbb{C} = \emptyset$ is open. $\emptyset$ and $\mathbb{C}$ are the
only two sets of complex numbers that are both open and closed.
Problem Set 7


Full solutions to these problems are available for free download here:


www.SATPrepGet800.com/PMFBXSG

LEVEL 1


1. Let $z = -4 - i$ and $w = 3 - 5i$. Compute each of the following:

(i) $z + w$
(ii) $zw$
(iii) $\operatorname{Im} w$
(iv) $2z - w$
(v) $\bar{w}$
(vi) $\frac{z}{w}$
(vii) $|z|$
(viii) the distance between $z$ and $w$


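LEVEL 2

2. Prove that $(\mathbb{C}, +, \cdot)$ is a field.


3. Let $z$ and $w$ be complex numbers. Prove the following:

(i) $\operatorname{Re} z = \frac{z + \bar{z}}{2}$
(ii) $\operatorname{Im} z = \frac{z - \bar{z}}{2i}$
(iii) $\overline{z + w} = \bar{z} + \bar{w}$
(iv) $\overline{zw} = \bar{z} \cdot \bar{w}$
(v) $\overline{\left(\frac{z}{w}\right)} = \frac{\bar{z}}{\bar{w}}$
(vi) $z\bar{z} = |z|^2$
(vii) $|zw| = |z||w|$
(viii) If $w \neq 0$, then $\left|\frac{z}{w}\right| = \frac{|z|}{|w|}$
(ix) $\operatorname{Re} z \leq |z|$
(x) $\operatorname{Im} z \leq |z|$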
LEVEL 3


4. Prove the Triangle Inequality (Theorem 7.3).


5. Let $z$ and $w$ be complex numbers. Prove $\big||z| - |w|\big| \leq |z + w| \leq |z| + |w|$.


6. A point $w$ is an accumulation point of a set $S$ of complex numbers if each deleted neighborhood
of $w$ contains at least one point in $S$. Determine the accumulation points of each of the following
sets:

(i) $\left\{\frac{1}{n} \mid n \in \mathbb{Z}^+\right\}$
(ii) $\left\{\frac{i}{n} \mid n \in \mathbb{Z}^+\right\}$
(iii) $\{i^n \mid n \in \mathbb{Z}^+\}$
(iv) $\left\{\frac{i^n}{n} \mid n \in \mathbb{Z}^+\right\}$
(v) $\{z \mid |z| < 1\}$
(vi) $\{z \mid 0 < |z - 2| < 3\}$


LEVEL 4


7. Determine if each of the following subsets of $\mathbb{C}$ is open, closed, both, or neither. Give a proof in
each case.

(i) $\emptyset$
(ii) $\mathbb{C}$
(iii) $\{z \in \mathbb{C} \mid |z| > 1\}$
(iv) $\{z \in \mathbb{C} \mid \operatorname{Im} z < -2\}$
(v) $\{i^n \mid n \in \mathbb{Z}^+\}$
(vi) $\{z \in \mathbb{C} \mid 2 < |z - 2| < 4\}$


8. Prove the following:

(i) An arbitrary union of open sets in $\mathbb{C}$ is an open set in $\mathbb{C}$.
(ii) A finite intersection of open sets in $\mathbb{C}$ is an open set in $\mathbb{C}$.
(iii) An arbitrary intersection of closed sets in $\mathbb{C}$ is a closed set in $\mathbb{C}$.
(iv) A finite union of closed sets in $\mathbb{C}$ is a closed set in $\mathbb{C}$.
(v) Every open set in $\mathbb{C}$ can be expressed as a union of open disks.
LEVEL 5


9. A complex number $z$ is an interior point of a set $S$ of complex numbers if there is a neighborhood
of $z$ that contains only points in $S$, whereas $w$ is a boundary point of $S$ if each neighborhood of
$w$ contains at least one point in $S$ and one point not in $S$. Prove the following:

(i) A set $S$ of complex numbers is open if and only if each point in $S$ is an interior point of $S$.
(ii) A set of complex numbers is open if and only if it contains none of its boundary points.
(iii) A set of complex numbers is closed if and only if it contains all its boundary points.


10. Let $D = \{z \in \mathbb{C} \mid |z| \leq 1\}$ be the closed unit disk and let $S$ be a subset of $D$ that includes the
interior of the disk but is missing at least one point on the bounding circle of the disk. Show that
$S$ is not a closed set.


11. Prove that a set of complex numbers is closed if and only if it contains all its accumulation points.
(See Problem 6 for the definition of an accumulation point.)


12. Prove that a set consisting of finitely many complex numbers is a closed set in $\mathbb{C}$. (Hint: Show
that a finite set has no accumulation points.)
LESSON 8 - LINEAR ALGEBRA
VECTOR SPACES


Vector Spaces Over Fields


Recall the following:


1. In previous lessons, we looked at three structures called fields: $\mathbb{Q}$ (the field of rational numbers),
$\mathbb{R}$ (the field of real numbers), and $\mathbb{C}$ (the field of complex numbers). Each of these fields comes
with two operations called addition and multiplication. Also, $\mathbb{Q}$ is a subfield of $\mathbb{R}$ and $\mathbb{R}$ is a
subfield of $\mathbb{C}$. This means that every rational number is a real number, every real number is a
complex number, and addition and multiplication in $\mathbb{Q}$, $\mathbb{R}$, and $\mathbb{C}$ all work the same way.


2. Fields have a particularly nice structure. When working in a field, we can perform all the
arithmetic and algebra that we remember from elementary and middle school. In particular,
we have closure, associativity, commutativity, identity elements, and inverse properties for
both addition and multiplication (with the exception that 0 has no multiplicative inverse), and
multiplication is distributive over addition.


3. The standard form of a complex number is $a + bi$, where $a$ and $b$ are real numbers. We add
two complex numbers using the rule $(a + bi) + (c + di) = (a + c) + (b + d)i$.


To give some motivation for the definition of a vector space, let’s begin with an example. 


Example 8.1: Consider the set $\mathbb{C}$ of complex numbers together with the usual definition of addition.
Let’s also consider another operation, which we will call scalar multiplication. For each $k \in \mathbb{R}$ and
$z = a + bi \in \mathbb{C}$, we define $kz$ to be $ka + kbi$.


The operation of scalar multiplication is a little different from other types of operations we have looked 
at previously because instead of multiplying two elements from C together, we are multiplying an 
element of R with an element of C. In this case, we will call the elements of R scalars. 


Let’s observe that we have the following properties:


1. $(\mathbb{C}, +)$ is a commutative group. In other words, for addition in $\mathbb{C}$, we have closure, associativity,
commutativity, an identity element (called 0), and the inverse property (the inverse of $a + bi$
is $-a - bi$). This follows immediately from the fact that $(\mathbb{C}, +, \cdot)$ is a field. When we choose to
think of $\mathbb{C}$ as a vector space, we will “forget about” the multiplication in $\mathbb{C}$, and just consider $\mathbb{C}$
together with addition. In doing so, we lose much of the field structure of the complex numbers,
but we retain the group structure of $(\mathbb{C}, +)$.


2. $\mathbb{C}$ is closed under scalar multiplication. That is, for all $k \in \mathbb{R}$ and $z \in \mathbb{C}$, we have $kz \in \mathbb{C}$. To
see this, let $z = a + bi \in \mathbb{C}$ and let $k \in \mathbb{R}$. Then, by definition, $kz = ka + kbi$. Since $a, b \in \mathbb{R}$,
and $\mathbb{R}$ is closed under multiplication, $ka \in \mathbb{R}$ and $kb \in \mathbb{R}$. It follows that $ka + kbi \in \mathbb{C}$.


3. $1z = z$. To see this, consider $1 \in \mathbb{R}$ and let $z = a + bi \in \mathbb{C}$. Then, since 1 is the multiplicative
identity for $\mathbb{R}$, we have $1z = 1a + 1bi = a + bi = z$.


4. For all $j, k \in \mathbb{R}$ and $z \in \mathbb{C}$, $(jk)z = j(kz)$ (Associativity of scalar multiplication). To see this,
let $j, k \in \mathbb{R}$ and $z = a + bi \in \mathbb{C}$. Then since multiplication is associative in $\mathbb{R}$, we have

$$(jk)z = (jk)(a + bi) = (jk)a + (jk)bi = j(ka) + j(kb)i = j(ka + kbi) = j(kz).$$


5. For all $k \in \mathbb{R}$ and $z, w \in \mathbb{C}$, $k(z + w) = kz + kw$ (Distributivity of 1 scalar over 2 vectors). To
see this, let $k \in \mathbb{R}$ and $z = a + bi$, $w = c + di \in \mathbb{C}$. Then since multiplication distributes over
addition in $\mathbb{R}$, we have

$$k(z + w) = k((a + bi) + (c + di)) = k((a + c) + (b + d)i) = k(a + c) + k(b + d)i$$
$$= (ka + kc) + (kb + kd)i = (ka + kbi) + (kc + kdi) = k(a + bi) + k(c + di) = kz + kw.$$


6. For all $j, k \in \mathbb{R}$ and $z \in \mathbb{C}$, $(j + k)z = jz + kz$ (Distributivity of 2 scalars over 1 vector). To see
this, let $j, k \in \mathbb{R}$ and $z = a + bi \in \mathbb{C}$. Then since multiplication distributes over addition in $\mathbb{R}$,
we have

$$(j + k)z = (j + k)(a + bi) = (j + k)a + (j + k)bi = (ja + ka) + (jb + kb)i$$
$$= (ja + jbi) + (ka + kbi) = j(a + bi) + k(a + bi) = jz + kz.$$
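Properties 3 through 6 can be spot-checked on sample values. The sketch below stores a complex number $a + bi$ as the pair `(a, b)` and uses exact rational scalars from Python's `fractions` module as stand-ins for real numbers; the helper names `add` and `scale` and the sample values are assumptions for illustration. A check on particular values is not a proof, but it shows how each identity is used.

```python
from fractions import Fraction as Fr

def add(z, w):
    """Add two complex numbers stored as pairs (a, b) representing a + bi."""
    return (z[0] + w[0], z[1] + w[1])

def scale(k, z):
    """Scalar multiplication k(a + bi) = ka + kbi on the pair (a, b)."""
    return (k * z[0], k * z[1])

j, k = Fr(3, 2), Fr(-4)
z, w = (Fr(1), Fr(2)), (Fr(-3), Fr(1, 2))

checks = {
    "scalar identity": scale(1, z) == z,
    "associativity": scale(j * k, z) == scale(j, scale(k, z)),
    "1 scalar over 2 vectors": scale(k, add(z, w)) == add(scale(k, z), scale(k, w)),
    "2 scalars over 1 vector": scale(j + k, z) == add(scale(j, z), scale(k, z)),
}
print(checks)
```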


Notes: (1) Since the properties listed in 1 through 6 above are satisfied, we say that C is a vector space 
over R. We will give the formal definition of a vector space below. 


(2) Note that a vector space consists of (i) a set of vectors (in this case C), (ii) a field (in this case R), 
and (iii) two operations called addition and scalar multiplication. 


(3) The operation of addition is a binary operation on the set of vectors, and the set of vectors together 
with this binary operation forms a commutative group. In the previous example (Example 8.1), we have 
that (C, +) is a commutative group. 


(4) Scalar multiplication is not a binary operation on the set of vectors. It takes pairs of the form $(k, v)$,
where $k$ is in the field and $v$ is a vector, to a vector $kv$. Formally speaking, scalar multiplication is a
function $f: F \times V \to V$, where $F$ is the field of scalars and $V$ is the set of vectors (see the beginning of
Lesson 3 for a brief explanation of this notation).


(5) We started with the example of C as a vector space over R because 
it has a geometric interpretation where we can draw simple pictures 
to visualize what the vector space looks like. Recall from Lesson 7 that 
we can think of the complex number a + bi as a directed line segment 
(which from now on we will call a vector) in the complex plane that 
begins at the origin and terminates at the point (a, b). 


For example, pictured to the right, we can see the vectors $i = 0 + 1i$,
$1 + 2i$, and $2 = 2 + 0i$ in the complex plane.


We can visualize the sum of two vectors as the vector starting at the 
origin that is the diagonal of the parallelogram formed from the original vectors. We see this in the first 
figure on the left below. In this figure, we have removed the complex plane and focused on the vectors 
1 + 2i and 2, together with their sum (1 + 2i) + (2+ 0i) = (1+2)+(2+0)i=3+2i. 
A second way to visualize the sum of two vectors is to translate one of the vectors so that its initial
point coincides with the terminal point of the other vector. The sum of the two vectors is then the 
vector whose initial point coincides with the initial point of the “unmoved” vector and whose terminal 
point coincides with the terminal point of the “moved” vector. We see two ways to do this in the center 
and rightmost figures below. 


Technically speaking, the center figure shows the sum (1 + 2i) + 2 and the rightmost figure shows the 
sum 2 + (1 + 2i). If we superimpose one figure on top of the other, we can see strong evidence that 
commutativity holds for addition. 


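[Figures: the parallelogram construction of $(1 + 2i) + 2$ (left) and the two head-to-tail constructions $(1 + 2i) + 2$ (center) and $2 + (1 + 2i)$ (right)]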


We can visualize a scalar multiple of a vector as follows: (i) if $k$ is a positive real number and $z \in \mathbb{C}$, then
the vector $kz$ points in the same direction as $z$ and has a length that is $k$ times the length of $z$; (ii) if $k$
is a negative real number and $z \in \mathbb{C}$, then the vector $kz$ points in the direction opposite of $z$ and has a
length that is $|k|$ times the length of $z$; (iii) if $k = 0$ and $z \in \mathbb{C}$, then $kz$ is a point.


In the figures below, we have a vector $z \in \mathbb{C}$, together with several scalar multiples of $z$.


[Figures: a vector $z$ shown with several of its scalar multiples, including $-z = (-1)z$]

We are now ready for the general definition of a vector space.

A vector space over a field $F$ is a set $V$ together with a binary operation $+$ on $V$ (called addition) and
an operation called scalar multiplication satisfying:

(1) $(V, +)$ is a commutative group.

(2) (Closure under scalar multiplication) For all $k \in F$ and $v \in V$, $kv \in V$.

(3) (Scalar multiplication identity) If 1 is the multiplicative identity of $F$ and $v \in V$, then $1v = v$.

(4) (Associativity of scalar multiplication) For all $j, k \in F$ and $v \in V$, $(jk)v = j(kv)$.

(5) (Distributivity of 1 scalar over 2 vectors) For all $k \in F$ and $v, w \in V$, $k(v + w) = kv + kw$.

(6) (Distributivity of 2 scalars over 1 vector) For all $j, k \in F$ and $v \in V$, $(j + k)v = jv + kv$.


Notes: (1) Recall from Lesson 3 that $(V, +)$ being a commutative group means the following:


• (Closure) For all $v, w \in V$, $v + w \in V$.
• (Associativity) For all $v, w, u \in V$, $(v + w) + u = v + (w + u)$.
• (Commutativity) For all $v, w \in V$, $v + w = w + v$.
• (Identity) There exists an element $0 \in V$ such that for all $v \in V$, $0 + v = v + 0 = v$.
• (Inverse) For each $v \in V$, there is $-v \in V$ such that $v + (-v) = (-v) + v = 0$.


(2) The fields that we are familiar with are Q (the field of rational numbers), R (the field of real 
numbers), and C (the field of complex numbers). For our purposes here, we can always assume that F 
is one of these three fields. 


Let’s look at some basic examples of vector spaces. 


Example 8.2:


1. Let $\mathbb{R}^2$ be the set of all ordered pairs of real numbers. That is, $\mathbb{R}^2 = \{(a, b) \mid a, b \in \mathbb{R}\}$. We
define addition by $(a, b) + (c, d) = (a + c, b + d)$. We define scalar multiplication by
$k(a, b) = (ka, kb)$ for each $k \in \mathbb{R}$. With these definitions, $\mathbb{R}^2$ is a vector space over $\mathbb{R}$.


Notice that $\mathbb{R}^2$ looks just like $\mathbb{C}$. In fact, $(a, b)$ is sometimes used as another notation for $a + bi$.
Therefore, the verification that $\mathbb{R}^2$ is a vector space over $\mathbb{R}$ is nearly identical to what we did in
Example 8.1 above.


We can visualize elements of $\mathbb{R}^2$ as points or vectors in a plane in exactly the same way that we
visualize complex numbers as points or vectors in the complex plane.


2. $\mathbb{R}^3 = \{(a, b, c) \mid a, b, c \in \mathbb{R}\}$ is a vector space over $\mathbb{R}$, where we define addition and scalar
multiplication by $(a, b, c) + (d, e, f) = (a + d, b + e, c + f)$ and $k(a, b, c) = (ka, kb, kc)$,
respectively.


We can visualize elements of $\mathbb{R}^3$ as points in space in a way similar to visualizing elements of
$\mathbb{R}^2$ and $\mathbb{C}$ as points in a plane.


3. More generally, we can let $\mathbb{R}^n = \{(a_1, a_2, \ldots, a_n) \mid a_i \in \mathbb{R} \text{ for each } i = 1, 2, \ldots, n\}$. Then $\mathbb{R}^n$ is
a vector space over $\mathbb{R}$, where we define addition and scalar multiplication by

$$(a_1, a_2, \ldots, a_n) + (b_1, b_2, \ldots, b_n) = (a_1 + b_1, a_2 + b_2, \ldots, a_n + b_n),$$
$$k(a_1, a_2, \ldots, a_n) = (ka_1, ka_2, \ldots, ka_n).$$


4. More generally still, if $F$ is any field (for our purposes, we can think of $F$ as $\mathbb{Q}$, $\mathbb{R}$, or $\mathbb{C}$), we let
$F^n = \{(a_1, a_2, \ldots, a_n) \mid a_i \in F \text{ for each } i = 1, 2, \ldots, n\}$. Then $F^n$ is a vector space over $F$, where
we define addition and scalar multiplication by

$$(a_1, a_2, \ldots, a_n) + (b_1, b_2, \ldots, b_n) = (a_1 + b_1, a_2 + b_2, \ldots, a_n + b_n),$$
$$k(a_1, a_2, \ldots, a_n) = (ka_1, ka_2, \ldots, ka_n).$$


Notes: (1) Ordered pairs have the property that $(a, b) = (c, d)$ if and only if $a = c$ and $b = d$. So, for
example, $(1, 2) \neq (2, 1)$. Compare this to the unordered pair (or set) $\{1, 2\}$. Recall that a set is
determined by its elements and not the order in which the elements are listed. So, $\{1, 2\} = \{2, 1\}$.


We will learn more about ordered pairs in Lesson 10. 
(2) $(a_1, a_2, \ldots, a_n)$ is called an $n$-tuple. So, $\mathbb{R}^n$ consists of all $n$-tuples of elements from $\mathbb{R}$, and more
generally, $F^n$ consists of all $n$-tuples of elements from the field $F$.


For example, $(3, 2 - i, \sqrt{2} + \sqrt{3}i, -3i) \in \mathbb{C}^4$. Similarly, any 8-tuple of rational numbers is an element
of $\mathbb{Q}^8$ (and since $\mathbb{Q}^8 \subseteq \mathbb{R}^8 \subseteq \mathbb{C}^8$, we can also say that such an 8-tuple is in $\mathbb{R}^8$ or $\mathbb{C}^8$).


(3) Similar to what we said in Note 1, we have $(a_1, a_2, \ldots, a_n) = (b_1, b_2, \ldots, b_n)$ if and only if $a_i = b_i$ for
all $i = 1, 2, \ldots, n$. So, for example, $(2, 5, \sqrt{2}, \sqrt{2})$ and $(2, \sqrt{2}, 5, \sqrt{2})$ are distinct elements from $\mathbb{R}^4$.


(4) You will be asked to verify that $F^n$ is a vector space over the field $F$ in Problem 3 below. Unless
stated otherwise, from now on we will always consider the vector space $F^n$ to be over the field $F$.
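The componentwise operations on $F^n$ can be sketched with Python tuples, which are ordered just like $n$-tuples. This is an illustrative sketch only: `vec_add` and `vec_scale` are hypothetical names, and `Fraction` values play the role of elements of the field of rational numbers.

```python
from fractions import Fraction

def vec_add(u, v):
    """Componentwise sum of two n-tuples of the same length."""
    if len(u) != len(v):
        raise ValueError("both vectors must have the same length n")
    return tuple(a + b for a, b in zip(u, v))

def vec_scale(k, v):
    """Multiply every component of the n-tuple v by the scalar k."""
    return tuple(k * a for a in v)

u = (Fraction(1, 2), Fraction(3), Fraction(-2))
v = (Fraction(1, 3), Fraction(0), Fraction(5))
print(vec_add(u, v))
print(vec_scale(Fraction(2), u))

# Order matters for tuples but not for sets, as described in Note 1.
print((1, 2) == (2, 1), {1, 2} == {2, 1})
```

The same two functions work unchanged for complex entries, since Python's `+` and `*` are defined for its built-in complex type as well.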


Let’s look at a few other examples of vector spaces. 
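Example 8.3:


1. Let $M = \left\{ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \,\middle|\, a, b, c, d \in \mathbb{R} \right\}$ be the set of all $2 \times 2$ matrices of real numbers. We add two
matrices using the rule $\begin{bmatrix} a & b \\ c & d \end{bmatrix} + \begin{bmatrix} e & f \\ g & h \end{bmatrix} = \begin{bmatrix} a + e & b + f \\ c + g & d + h \end{bmatrix}$ and we multiply a matrix by a real
number using the rule $k \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} ka & kb \\ kc & kd \end{bmatrix}$. It is straightforward to check that $M$ is a vector
space over $\mathbb{R}$.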


2. For $m, n \in \mathbb{Z}^+$, an $m \times n$ matrix over a field $F$ is a rectangular array with $m$ rows and $n$ columns,
and entries in $F$. For example, the matrix $A = \begin{bmatrix} 5 & 2 & \frac{1}{5} \\ -3 & \sqrt{3} & 7 \end{bmatrix}$ is a $2 \times 3$ matrix over $\mathbb{R}$. We
will generally use a capital letter to represent a matrix, and the corresponding lowercase letter
with double subscripts to represent the entries of the matrix. We use the first subscript for the
row and the second subscript for the column. Using the matrix $A$ above as an example, we see
that $a_{21} = -3$ because the entry in row 2 and column 1 is $-3$. Similarly, we have $a_{11} = 5$,
$a_{12} = 2$, $a_{13} = \frac{1}{5}$, $a_{22} = \sqrt{3}$, and $a_{23} = 7$.


Let $M_{mn}^F$ be the set of all $m \times n$ matrices over the field $F$. We add two matrices $A, B \in M_{mn}^F$ to
get $A + B \in M_{mn}^F$ using the rule $(a + b)_{ij} = a_{ij} + b_{ij}$. We multiply a matrix $A \in M_{mn}^F$ by a
scalar $k \in F$ using the rule $(ka)_{ij} = ka_{ij}$.


For example, if we let $A$ be the matrix above and $B = \begin{bmatrix} 2 & -5 & \frac{2}{5} \\ -1 & -\sqrt{3} & 1 \end{bmatrix}$, then we have

$$A + B = \begin{bmatrix} 7 & -3 & \frac{3}{5} \\ -4 & 0 & 8 \end{bmatrix} \quad \text{and} \quad 2A = \begin{bmatrix} 10 & 4 & \frac{2}{5} \\ -6 & 2\sqrt{3} & 14 \end{bmatrix}.$$


Notice that we get the entry in the first row and first column of $A + B$ as follows:

$$(a + b)_{11} = a_{11} + b_{11} = 5 + 2 = 7$$


Similarly, we get the other two entries in the first row like this:

$$(a + b)_{12} = a_{12} + b_{12} = 2 + (-5) = -3 \qquad (a + b)_{13} = a_{13} + b_{13} = \frac{1}{5} + \frac{2}{5} = \frac{3}{5}$$

I leave it to the reader to write out the details for computing the entries in the second row of
$A + B$.

We get the entries in the first row of $2A$ as follows:

$$(2a)_{11} = 2a_{11} = 2 \cdot 5 = 10 \qquad (2a)_{12} = 2a_{12} = 2 \cdot 2 = 4 \qquad (2a)_{13} = 2a_{13} = 2 \cdot \frac{1}{5} = \frac{2}{5}$$

I leave it to the reader to write out the details for computing the entries in the second row of
$2A$.


With the operations of addition and scalar multiplication defined as we have above, it is not too
hard to show that $M_{mn}^F$ is a vector space over $F$.


3. Let $P = \{ax^2 + bx + c \mid a, b, c \in \mathbb{R}\}$ be the set of polynomials of degree at most 2 with real
coefficients. We define addition and scalar multiplication (with scalars in $\mathbb{R}$) on this set of
polynomials as follows:

$$(ax^2 + bx + c) + (dx^2 + ex + f) = (a + d)x^2 + (b + e)x + (c + f),$$
$$k(ax^2 + bx + c) = (ka)x^2 + (kb)x + (kc).$$

For example, if $p(x) = 2x^2 + 3x - 5$ and $q(x) = -5x + 4$, then $p(x), q(x) \in P$ and we have

$$p(x) + q(x) = (2x^2 + 3x - 5) + (-5x + 4) = 2x^2 - 2x - 1,$$
$$3p(x) = 3(2x^2 + 3x - 5) = 6x^2 + 9x - 15.$$


It is straightforward to check that $P$ is a vector space over $\mathbb{R}$.
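The polynomial operations above act on coefficients only, so a polynomial $ax^2 + bx + c$ can be modeled as the triple `(a, b, c)`. The following sketch takes that approach; the representation and the names `poly_add` and `poly_scale` are assumptions made for illustration.

```python
def poly_add(p, q):
    """Add polynomials stored as coefficient triples (a, b, c) for ax^2 + bx + c."""
    return tuple(x + y for x, y in zip(p, q))

def poly_scale(k, p):
    """Multiply each coefficient of the polynomial p by the scalar k."""
    return tuple(k * x for x in p)

p = (2, 3, -5)   # p(x) = 2x^2 + 3x - 5
q = (0, -5, 4)   # q(x) = -5x + 4, stored with leading coefficient 0

print(poly_add(p, q))
print(poly_scale(3, p))
```

Written this way, the operations on $P$ are the same componentwise operations used for triples of real numbers, which suggests why the verification that $P$ is a vector space is straightforward.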


Subspaces 


Let $V$ be a vector space over a field $F$. A subset $U$ of $V$ is called a subspace of $V$, written $U \leq V$, if it is
also a vector space with respect to the same operations of addition and scalar multiplication as they
were defined in $V$.


Notes: (1) Recall from Note 2 following Example 3.3 that a universal statement is a statement that 
describes a property that is true for all elements without mentioning the existence of any new 
elements. A universal statement begins with the quantifier V (“For all”) and never includes the 
quantifier 3 (“There exists” or “There is”). 


Properties defined by universal statements are closed downwards. This means that if a property 
defined by a universal statement is true in V and U is a subset of V, then the property is true in U as 
well. 


For example, the statement for commutativity is ∀v, w (v + w = w + v). This is read “For all v and w,
v + w = w + v.” The quantifier ∀ is referring to whichever set we are considering. If we are thinking
about the set V, then we mean “For all v and w in V, v + w = w + v.” If we are thinking about the set
U, then we mean “For all v and w in U, v + w = w + v.”


If we assume that + is commutative in V and U ⊆ V, we can easily show that + is also commutative in
U. To see this, let v, w ∈ U. Since U ⊆ V, we have v, w ∈ V. Since + is commutative in V, we have
v + w = w + v. Since v and w were arbitrary elements in U, we see that + is commutative in U.
(2) Associativity, commutativity, and distributivity are all defined by universal statements, and
therefore, when checking if U is a subspace of V, we do not need to check any of these properties.
They will always be satisfied in the subset U.


(3) The identity property for addition is not defined by a universal statement. It begins with the
existential quantifier ∃ (“There is”). Therefore, we do need to check that the identity 0 is in a subset U
of V when determining if U is a subspace of V. However, once we have checked that 0 is there, we do
not need to check that it satisfies the property of being an identity. As long as 0 ∈ U (the same 0 from
V), it will behave as an identity because the defining property of 0 contains only the quantifier ∀.


(4) The inverse property for addition will always be true in a subset U of a vector space V that is closed
under scalar multiplication. To see this, we use the fact that (−1)v = −v for all v in a vector space (see
Problem 4 (iv) below). If v ∈ U, then closure under scalar multiplication gives −v = (−1)v ∈ U.


(5) Since the multiplicative identity 1 comes from the field F and not the vector space V, and we are 
using the same field for the subset U, we do not need to check the scalar multiplication identity when 
verifying that U is a subspace of V. 


(6) The main issue when checking if a subset U of V is a subspace of V is closure. For example, we need
to make sure that whenever we add two vectors in U, we get a vector that is also in U. If we were to take
an arbitrary subset of V, then there is no reason this should happen. For example, let’s consider the
vector space ℂ over the field ℝ. Let A = {2 + bi | b ∈ ℝ}. A is a subset of ℂ, but A is not a subspace of
ℂ. To see this, we just need a single counterexample. 2 + i ∈ A, but (2 + i) + (2 + i) = 4 + 2i ∉ A
(because the real part is 4 and not 2).


(7) Notes 1 through 6 above tell us that to determine if a subset U of a vector space V is a subspace of
V, we need only check that 0 ∈ U and that U is closed under addition and scalar multiplication.


(8) The statements for closure, as we have written them, do look a lot like universal statements. For
example, the statement for closure under addition is “For all v, w ∈ V, v + w ∈ V.” The issue here is
that the set V is not allowed to be explicitly mentioned in the formula. It needs to be understood.


For example, we saw in Note 1 that the statement for commutativity can be written as
“∀v, w (v + w = w + v).” The quantifier ∀ (for all) can be applied to any set for which there is a notion
of addition defined. We also saw that if the statement is true in V, and U is a subset of V, then the
statement will be true in U.


With the statement of closure, to eliminate the set V from the formula, we would need to say
something like, “For all x and y, x + y exists.” However, there is no way to say “exists” using just logical
notation without naming the set in which we want x + y to exist.


We summarize these notes in the following theorem. 


Theorem 8.1: Let V be a vector space over a field F and let U ⊆ V. Then U ≤ V if and only if (i) 0 ∈ U,
(ii) for all v, w ∈ U, v + w ∈ U, and (iii) for all v ∈ U and k ∈ F, kv ∈ U.


Proof: Let V be a vector space over a field F, and let U ⊆ V.
If U is a subspace of V, then by definition of U being a vector space, (i), (ii), and (iii) hold.
Now suppose that (i), (ii), and (iii) hold.
By (ii), + is a binary operation on U.


Associativity and commutativity of + are defined by universal statements, and therefore, since they
hold in V and U ⊆ V, they hold in U.


We are given that 0 ∈ U. If v ∈ U, then since U ⊆ V, v ∈ V. Since 0 is the additive identity for V,
0 + v = v + 0 = v. Since v ∈ U was arbitrary, the additive identity property holds in U.


Let v ∈ U. Since U ⊆ V, v ∈ V. Therefore, there is −v ∈ V such that v + (−v) = (−v) + v = 0. By (iii),
(−1)v ∈ U, and by Problem 4 (part (iv)), (−1)v = −v. So, −v ∈ U. Since v ∈ U was arbitrary, the
additive inverse property holds in U.


So, (U, +) is a commutative group.
By (iii), U is closed under scalar multiplication.


Associativity of scalar multiplication and both types of distributivity are defined by universal
statements, and therefore, since they hold in V and U ⊆ V, they hold in U.


Finally, if v ∈ U, then since U ⊆ V, v ∈ V. So, 1v = v, and the scalar multiplication identity property
holds in U.


Therefore, U ≤ V. □


Example 8.4: 


1. Let V = ℝ² = {(a, b) | a, b ∈ ℝ} be the vector space over ℝ with the usual definitions of
addition and scalar multiplication, and let U = {(a, 0) | a ∈ ℝ}. If (a, 0) ∈ U, then a, 0 ∈ ℝ,
and so (a, 0) ∈ V. Thus, U ⊆ V. The 0 vector of V is (0, 0), which is in U. If (a, 0), (b, 0) ∈ U and
k ∈ ℝ, then (a, 0) + (b, 0) = (a + b, 0) ∈ U and k(a, 0) = (ka, 0) ∈ U. It follows from
Theorem 8.1 that U ≤ V.


This subspace U of ℝ² looks and behaves just like ℝ, the set of real numbers. More specifically,
we say that U is isomorphic to ℝ. Most mathematicians identify this subspace U of ℝ² with ℝ,
and just call it ℝ. See Lesson 11 for a precise definition of “isomorphic.”


In general, it is common practice for mathematicians to call various isomorphic copies of certain
structures by the same name. As a generalization of this example, if m < n, then we can say
ℝᵐ ≤ ℝⁿ by identifying (a₁, a₂, ..., aₘ) ∈ ℝᵐ with the vector (a₁, a₂, ..., aₘ, 0, 0, ..., 0) ∈ ℝⁿ
that has a tail end of n − m zeros. For example, we may say that (2, √2, 7, −1/2, 0, 0, 0) is in ℝ⁴,
even though it is technically in ℝ⁷. With this type of identification, we have ℝ⁴ ≤ ℝ⁷.

2. Let V = ℚ³ = {(a, b, c) | a, b, c ∈ ℚ} be the vector space over ℚ with the usual definitions of
addition and scalar multiplication, and let U = {(a, b, c) ∈ ℚ³ | c = a + 2b}. Let’s check that
U ≤ V.


It’s clear that U ⊆ V. Since 0 = 0 + 2 · 0, we see that the zero vector (0, 0, 0) is in U. Let
(a, b, c), (d, e, f) ∈ U and k ∈ ℚ. Then we have


(a, b, c) + (d, e, f) = (a, b, a + 2b) + (d, e, d + 2e) = (a + d, b + e, (a + d) + 2(b + e)).
k(a, b, c) = k(a, b, a + 2b) = (ka, kb, ka + 2(kb)).
These vectors are both in U, and so, by Theorem 8.1, U ≤ V.


3. Consider V = ℂ as a vector space over ℝ in the usual way and let U = {z ∈ ℂ | Re z = 1}. Then
U ⊆ V, but U ≰ V because the zero vector is not in U. After all, 0 = 0 + 0i, and so, Re 0 = 0.


4. Let V = {ax² + bx + c | a, b, c ∈ ℝ} be the set of polynomials of degree at most 2 with real coefficients
over ℝ, and let U = {p(x) ∈ V | p(5) = 0}. Let’s check that U ≤ V (note that if
p(x) = ax² + bx + c, then p(5) = 25a + 5b + c).


It’s clear that U ⊆ V. The zero polynomial p(x) = 0 satisfies p(5) = 0, and so, the zero vector
is in U. Let p(x), q(x) ∈ U and k ∈ ℝ. Then we have p(5) + q(5) = 0 + 0 = 0, so that
p(x) + q(x) ∈ U, and we have kp(5) = k · 0 = 0, so that kp(x) ∈ U. By Theorem 8.1, U ≤ V.


5. Every vector space is a subspace of itself, and the vector space consisting of just the 0 vector
from the vector space V is a subspace of V.


In other words, for any vector space V, V ≤ V and {0} ≤ V.


The empty set, however, can never be a subspace of a vector space because it doesn’t contain
a zero vector.
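The three conditions of Theorem 8.1 can be spot-checked by computer for the set U = {(a, b, c) ∈ ℚ³ | c = a + 2b} from part 2. The sketch below is only a sanity check on a handful of arbitrarily chosen sample vectors and scalars, not a proof; the helper names in_U, add, and scale are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product

def in_U(v):
    """Membership test for U = {(a, b, c) in Q^3 | c = a + 2b}."""
    a, b, c = v
    return c == a + 2 * b

def add(v, w):
    """Entrywise vector addition."""
    return tuple(x + y for x, y in zip(v, w))

def scale(k, v):
    """Entrywise scalar multiplication."""
    return tuple(k * x for x in v)

# Sample vectors of U and sample scalars (arbitrary choices for illustration).
samples = [(Fraction(1), Fraction(2), Fraction(5)),
           (Fraction(-3, 2), Fraction(1, 4), Fraction(-1)),
           (Fraction(0), Fraction(0), Fraction(0))]
scalars = [Fraction(0), Fraction(7, 3), Fraction(-2)]

zero_in_U = in_U((0, 0, 0))                                              # condition (i)
closed_add = all(in_U(add(v, w)) for v, w in product(samples, repeat=2))  # condition (ii)
closed_scale = all(in_U(scale(k, v)) for k in scalars for v in samples)   # condition (iii)
print(zero_in_U, closed_add, closed_scale)

# Part 3: for U = {z in C | Re z = 1}, condition (i) already fails.
print((0 + 0j).real == 1)
```

Exact rational arithmetic (Fraction) is used so that the membership test c = a + 2b is not disturbed by floating-point rounding.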


Theorem 8.2: Let V be a vector space over a field F and let U and W be subspaces of V. Then U ∩ W
is a subspace of V.


Proof: Let V be a vector space over a field F and let U and W be subspaces of V. Since U ≤ V, 0 ∈ U.
Since W ≤ V, 0 ∈ W. So, 0 ∈ U ∩ W. Let v, w ∈ U ∩ W. So, v, w ∈ U and v, w ∈ W. Since U ≤ V and
W ≤ V, v + w ∈ U and v + w ∈ W. Therefore, v + w ∈ U ∩ W. Let v ∈ U ∩ W and k ∈ F. Then
v ∈ U and v ∈ W. Since U ≤ V and W ≤ V, kv ∈ U and kv ∈ W. So, kv ∈ U ∩ W. By Theorem 8.1,
U ∩ W ≤ V. □


Bases


Let V be a vector space over a field F, let v, w ∈ V, and j, k ∈ F. The expression jv + kw is called a
linear combination of the vectors v and w. We call the scalars j and k weights.


Example 8.5: Let V = ℝ² = {(a, b) | a, b ∈ ℝ} be the vector space over ℝ with the usual definitions
of addition and scalar multiplication. Let v = (1, 0), w = (0, 1), j = 4, and k = −2. We have
jv + kw = 4(1, 0) − 2(0, 1) = (4, 0) + (0, −2) = (4, −2).


It follows that the vector (4, −2) is a linear combination of the vectors (1, 0) and (0, 1) with weights 4
and −2, respectively.


If v, w ∈ V, where V is a vector space over a field F, then the set of all linear combinations of v and
w is called the span of v and w. Symbolically, we have span{v, w} = {jv + kw | j, k ∈ F}.


Example 8.6: In Example 8.5, we saw that (4, −2) can be written as a linear combination of the vectors
(1, 0) and (0, 1). It follows that (4, −2) ∈ span{(1, 0), (0, 1)}.
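The computation of a linear combination can be mirrored in a few lines of Python. This is an illustrative sketch only: the function name linear_combination is hypothetical, and vectors are assumed to be stored as tuples of numbers.

```python
def linear_combination(weights, vectors):
    """Return k_1*v_1 + ... + k_n*v_n, computed entry by entry."""
    length = len(vectors[0])
    return tuple(sum(k * v[i] for k, v in zip(weights, vectors))
                 for i in range(length))

# Example 8.5: weights 4 and -2 applied to (1, 0) and (0, 1).
print(linear_combination([4, -2], [(1, 0), (0, 1)]))  # (4, -2)
```

Every vector produced by this function, for any choice of weights, lies in the span of the given vectors; that is exactly how the span is defined.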


Theorem 8.3: Let V = ℝ² = {(a, b) | a, b ∈ ℝ} be the vector space over ℝ with the usual definitions
of addition and scalar multiplication. Then span{(1, 0), (0, 1)} = ℝ².


Proof: Let v ∈ span{(1, 0), (0, 1)}. Then there are weights j, k ∈ ℝ with v = j(1, 0) + k(0, 1). So, we
have v = j(1, 0) + k(0, 1) = (j, 0) + (0, k) = (j, k). Since j, k ∈ ℝ, we have v = (j, k) ∈ ℝ². Since
v ∈ span{(1, 0), (0, 1)} was arbitrary, span{(1, 0), (0, 1)} ⊆ ℝ².


Now, let v ∈ ℝ². Then there are a, b ∈ ℝ with v = (a, b) = (a, 0) + (0, b) = a(1, 0) + b(0, 1). Since
we have expressed v as a linear combination of (1, 0) and (0, 1), we see that v ∈ span{(1, 0), (0, 1)}.
Since v ∈ ℝ² was arbitrary, ℝ² ⊆ span{(1, 0), (0, 1)}.


Since span{(1, 0), (0, 1)} ⊆ ℝ² and ℝ² ⊆ span{(1, 0), (0, 1)}, we have span{(1, 0), (0, 1)} = ℝ². □


If v, w ∈ V, where V is a vector space over a field F, then we say that v and w are linearly
independent if neither vector is a scalar multiple of the other one. Otherwise, we say that v and w are
linearly dependent.

Example 8.7:


1. The vectors (1, 0) and (0, 1) are linearly independent in ℝ² because for any k ∈ ℝ, we have
k(1, 0) = (k, 0) ≠ (0, 1) and k(0, 1) = (0, k) ≠ (1, 0).


2. The vectors (1, 2) and (−3, −6) are linearly dependent in ℝ² because (−3, −6) = −3(1, 2).


If v, w ∈ V, where V is a vector space over a field F, then we say that {v, w} is a basis of V if v and w
are linearly independent and span{v, w} = V.


Example 8.8:


1. In Example 8.7, we saw that the vectors (1, 0) and (0, 1) are linearly independent in ℝ². By
Theorem 8.3, span{(1, 0), (0, 1)} = ℝ². It follows that {(1, 0), (0, 1)} is a basis of ℝ².


2. In Example 8.7, we saw that the vectors (1, 2) and (−3, −6) are linearly dependent in ℝ². It
follows that {(1, 2), (−3, −6)} is not a basis of ℝ².
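For two vectors in ℝ², linear dependence can also be tested by a single computation. The sketch below uses a standard fact that is not stated in the text: (a, b) and (c, d) are linearly dependent exactly when ad − bc = 0. The function name dependent_2d is hypothetical, and integer entries are used so that the comparison with 0 is exact.

```python
def dependent_2d(v, w):
    """True if v and w in R^2 are linearly dependent.

    Uses the 2 x 2 determinant test: (a, b) and (c, d) are dependent
    exactly when a*d - b*c == 0 (a standard fact, assumed here).
    """
    return v[0] * w[1] - v[1] * w[0] == 0

print(dependent_2d((1, 0), (0, 1)))    # False: independent
print(dependent_2d((1, 2), (-3, -6)))  # True: (-3, -6) = -3(1, 2)
```

With floating-point entries, an exact comparison with 0 would be unreliable, so this version is intended for integer or rational input only.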


We would like to generalize the notion of linear dependence to more than two vectors. The definition 
of one vector being a scalar multiple of the other isn’t quite good enough to do that. The following 
theorem gives us an alternative definition of linear dependence that generalizes nicely. 


Theorem 8.4: Let V be a vector space over a field F and let v, w ∈ V. Then v and w are linearly
dependent if and only if there are j, k ∈ F, not both 0, such that jv + kw = 0.


Proof: Let v, w ∈ V, and suppose that v and w are linearly dependent. Then one vector is a scalar
multiple of the other. Without loss of generality, we may assume that there is c ∈ F with v = cw. Then
we have 1v + (−c)w = 0. So, if we let j = 1 and k = −c, then jv + kw = 0 and j = 1 ≠ 0.
Now suppose that there are j, k ∈ F, not both 0, such that jv + kw = 0. Without loss of generality,
assume that j ≠ 0. Then we have jv = −kw, and so, v = −(k/j)w. So, v is a scalar multiple of w.


Therefore, v and w are linearly dependent. □


Note: See the Note following Theorem 6.6 in Lesson 6 for an explanation of the expression “Without 
loss of generality,” and how to properly use it in a proof. 


We will now extend the notions of linear dependence and independence to more than two vectors. 


Let V be a vector space over a field F, let v₁, v₂, ..., vₙ ∈ V, and k₁, k₂, ..., kₙ ∈ F. The expression
k₁v₁ + k₂v₂ + ⋯ + kₙvₙ is called a linear combination of the vectors v₁, v₂, ..., vₙ. We call the scalars
k₁, k₂, ..., kₙ weights.


Example 8.9: Let V = ℝ³ = {(a, b, c) | a, b, c ∈ ℝ} be the vector space over ℝ with the usual
definitions of addition and scalar multiplication. Let v₁ = (1, 0, 0), v₂ = (0, 1, 0), v₃ = (0, 0, 1),
k₁ = 3, k₂ = −5, and k₃ = 6. We have
k₁v₁ + k₂v₂ + k₃v₃ = 3(1, 0, 0) − 5(0, 1, 0) + 6(0, 0, 1)
= (3, 0, 0) + (0, −5, 0) + (0, 0, 6) = (3, −5, 6).


It follows that the vector (3, −5, 6) is a linear combination of the vectors (1, 0, 0), (0, 1, 0), and (0, 0, 1)
with weights 3, −5, and 6, respectively.


If v₁, v₂, ..., vₙ ∈ V, where V is a vector space over a field F, then the set of all linear combinations of
v₁, v₂, ..., vₙ is called the span of v₁, v₂, ..., vₙ. Symbolically, we have
span{v₁, v₂, ..., vₙ} = {k₁v₁ + k₂v₂ + ⋯ + kₙvₙ | k₁, k₂, ..., kₙ ∈ F}.


Example 8.10: In Example 8.9, we saw that (3, −5, 6) can be written as a linear combination of the
vectors (1, 0, 0), (0, 1, 0), and (0, 0, 1). It follows that (3, −5, 6) ∈ span{(1, 0, 0), (0, 1, 0), (0, 0, 1)}.


Theorem 8.5: Let V = ℝⁿ = {(k₁, k₂, ..., kₙ) | k₁, k₂, ..., kₙ ∈ ℝ} be the vector space over ℝ with the
usual definitions of addition and scalar multiplication. Then


span{(1, 0, 0, ..., 0), (0, 1, 0, ..., 0), ..., (0, 0, 0, ..., 1)} = ℝⁿ.


Proof: Let v ∈ span{(1, 0, 0, ..., 0), (0, 1, 0, ..., 0), ..., (0, 0, 0, ..., 1)}. Then there are weights
k₁, k₂, ..., kₙ ∈ ℝ with v = k₁(1, 0, 0, ..., 0) + k₂(0, 1, 0, ..., 0) + ⋯ + kₙ(0, 0, 0, ..., 1). So, we have
v = (k₁, 0, 0, ..., 0) + (0, k₂, 0, ..., 0) + ⋯ + (0, 0, 0, ..., kₙ) = (k₁, k₂, ..., kₙ). Since k₁, k₂, ..., kₙ ∈ ℝ,
we have v = (k₁, k₂, ..., kₙ) ∈ ℝⁿ. Since v ∈ span{(1, 0, 0, ..., 0), (0, 1, 0, ..., 0), ..., (0, 0, 0, ..., 1)} was
arbitrary, span{(1, 0, 0, ..., 0), (0, 1, 0, ..., 0), ..., (0, 0, 0, ..., 1)} ⊆ ℝⁿ.


Now, let v ∈ ℝⁿ. Then there are k₁, k₂, ..., kₙ ∈ ℝ with
v = (k₁, k₂, ..., kₙ) = (k₁, 0, 0, ..., 0) + (0, k₂, 0, ..., 0) + ⋯ + (0, 0, 0, ..., kₙ)
= k₁(1, 0, 0, ..., 0) + k₂(0, 1, 0, ..., 0) + ⋯ + kₙ(0, 0, 0, ..., 1).
Since we have expressed v as a linear combination of (1, 0, 0, ..., 0), (0, 1, 0, ..., 0), ..., (0, 0, 0, ..., 1),
we see that v ∈ span{(1, 0, 0, ..., 0), (0, 1, 0, ..., 0), ..., (0, 0, 0, ..., 1)}. Since v ∈ ℝⁿ was arbitrary, we
have ℝⁿ ⊆ span{(1, 0, 0, ..., 0), (0, 1, 0, ..., 0), ..., (0, 0, 0, ..., 1)}.


Therefore, span{(1, 0, 0, ..., 0), (0, 1, 0, ..., 0), ..., (0, 0, 0, ..., 1)} = ℝⁿ. □


If v₁, v₂, ..., vₙ ∈ V, where V is a vector space over a field F, then we say that v₁, v₂, ..., vₙ are linearly
dependent if there exist weights k₁, k₂, ..., kₙ ∈ F, with at least one weight nonzero, such that
k₁v₁ + k₂v₂ + ⋯ + kₙvₙ = 0. Otherwise, we say that v₁, v₂, ..., vₙ are linearly independent.


Notes: (1) v₁, v₂, ..., vₙ are linearly independent if whenever we write k₁v₁ + k₂v₂ + ⋯ + kₙvₙ = 0,
it follows that all the weights k₁, k₂, ..., kₙ are 0.


(2) We will sometimes call the expression k₁v₁ + k₂v₂ + ⋯ + kₙvₙ = 0 a dependence relation. If at least
one of the weights k₁, k₂, ..., kₙ is nonzero, then we say that the dependence relation is nontrivial.


Example 8.11:


1. The three vectors (1, 0, 0), (0, 1, 0), and (0, 0, 1) are linearly independent in ℝ³. To see this,
note that we have


k₁(1, 0, 0) + k₂(0, 1, 0) + k₃(0, 0, 1) = (k₁, 0, 0) + (0, k₂, 0) + (0, 0, k₃) = (k₁, k₂, k₃).


So, k₁(1, 0, 0) + k₂(0, 1, 0) + k₃(0, 0, 1) = (0, 0, 0) if and only if (k₁, k₂, k₃) = (0, 0, 0) if and
only if k₁ = 0, k₂ = 0, and k₃ = 0.


2. A similar computation shows that the n vectors (1, 0, 0, ..., 0), (0, 1, 0, ..., 0), ..., (0, 0, 0, ..., 1)
are linearly independent in ℝⁿ.


3. The vectors (1, 2, 3), (−2, 4, 3), and (1, 10, 12) are linearly dependent in ℝ³. To see this, note
that 3(1, 2, 3) + (−2, 4, 3) = (3, 6, 9) + (−2, 4, 3) = (1, 10, 12), and therefore,


3(1, 2, 3) + (−2, 4, 3) − (1, 10, 12) = 0.


This gives us a nontrivial dependence relation because we have at least one nonzero weight (in
fact, all three weights are nonzero). The weights are 3, 1, and −1.


If v₁, v₂, ..., vₙ ∈ V, where V is a vector space over a field F, then we say that {v₁, v₂, ..., vₙ} is a basis
of V if v₁, v₂, ..., vₙ are linearly independent and span{v₁, v₂, ..., vₙ} = V.


Example 8.12:


1. In Example 8.11, we saw that the vectors (1, 0, 0), (0, 1, 0), and (0, 0, 1) are linearly
independent in ℝ³. By Theorem 8.5, span{(1, 0, 0), (0, 1, 0), (0, 0, 1)} = ℝ³. It follows that
{(1, 0, 0), (0, 1, 0), (0, 0, 1)} is a basis of ℝ³.


Similarly, {(1, 0, 0, ..., 0), (0, 1, 0, ..., 0), ..., (0, 0, 0, ..., 1)} is a basis of ℝⁿ.


2. In Example 8.11, we saw that the vectors (1, 2, 3), (−2, 4, 3), and (1, 10, 12) are linearly
dependent in ℝ³. It follows that {(1, 2, 3), (−2, 4, 3), (1, 10, 12)} is not a basis of ℝ³.
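Both parts of this example can be double-checked numerically. The sketch below relies on a standard fact that the text does not state: three vectors in ℝ³ form a basis exactly when the 3 × 3 determinant with those vectors as rows is nonzero. The helper name det3 is hypothetical.

```python
def det3(u, v, w):
    """Determinant of the 3 x 3 matrix with rows u, v, w (cofactor expansion)."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))

standard = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
other = [(1, 2, 3), (-2, 4, 3), (1, 10, 12)]

print(det3(*standard))  # 1: nonzero, consistent with a basis
print(det3(*other))     # 0: consistent with linear dependence

# The dependence relation with weights 3, 1, and -1, computed entry by entry.
relation = tuple(3 * a + b - c for a, b, c in zip(*other))
print(relation)         # (0, 0, 0)
```

The determinant only reports whether a nontrivial dependence relation exists; the last line confirms the specific relation with weights 3, 1, and −1.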
Problem Set 8


Full solutions to these problems are available for free download here: 


www.SATPrepGet800.com/PMFBXSG 


LEVEL 1 
1. Determine if each of the following subsets of ℝ² is a subspace of ℝ²:
(i) A = {(x, y) | x + y = 0}
(ii) B = {(x, y) | xy = 0}
(iii) C = {(x, y) | 2x = 3y}
(iv) D = {(x, y) | x ∈ ℚ}


2. For each of the following, determine if the given pair of vectors v and w are linearly independent
or linearly dependent in the given vector space V:


(i) V = ℚ⁴ [the entries of v and w are illegible in the source]
(ii) V = ℝ³, v = (1, √2, 1), w = (√2, 2, √2)
(iii) V = ℂ⁵, v = (1, i, 2 − i, 0, 3i), w = (−i, 1, −1 − 2i, 0, 3)
(iv) [v and w are matrices with entries involving a and b; the entries are illegible in the source]
(v) V = {ax² + bx + c | a, b, c ∈ ℝ}, v = x, w = x²


LEVEL 2 


3. Let F be a field. Prove that Fⁿ is a vector space over F.


4. Let V be a vector space over F. Prove each of the following:
(i) For every v ∈ V, −(−v) = v.
(ii) For every v ∈ V, 0v = 0.
(iii) For every k ∈ F, k · 0 = 0.
(iv) For every v ∈ V, (−1)v = −v.


LEVEL 3 


5. Let V be a vector space over a field F and let X be a set of subspaces of V. Prove that ⋂X is a
subspace of V.


6. Prove that a finite set with at least two vectors is linearly dependent if and only if one of the 
vectors in the set can be written as a linear combination of the other vectors in the set. 
LEVEL 4


7. Let U and W be subspaces of a vector space V. Determine necessary and sufficient conditions
for U ∪ W to be a subspace of V.


8. Give an example of vector spaces U and V with U ⊆ V such that U is closed under scalar
multiplication, but U is not a subspace of V.


LEVEL 5 


9. Let S be a set of two or more linearly dependent vectors in a vector space V. Prove that there is
a vector v in the set so that span S = span(S \ {v}).


10. Prove that a finite set of vectors S in a vector space V is a basis of V if and only if every vector 
in V can be written uniquely as a linear combination of the vectors in S. 


11. Let S = {v₁, v₂, ..., vₘ} be a set of linearly independent vectors in a vector space V and let
T = {w₁, w₂, ..., wₙ} be a set of vectors in V such that span T = V. Prove that m ≤ n.


12. Let B be a basis of a vector space V with n vectors. Prove that any other basis of V also has n 
vectors. 
LESSON 9 - LOGIC
LOGICAL ARGUMENTS 


Statements and Substatements 


In Lesson 1, we introduced propositional variables such as p, q, and r to represent the building blocks 
of statements (or propositions). 


We now define the set of statements a bit more formally as follows: 


1. We have a list of symbols p, q,r, ... called propositional variables, each of which is a statement 
(these are the atomic statements). 


2. Whenever φ is a statement, (¬φ) is a statement.


3. Whenever φ and ψ are statements, (φ ∧ ψ), (φ ∨ ψ), (φ → ψ), and (φ ↔ ψ) are statements.


Notes: (1) For easier readability, we will always drop the outermost pair of parentheses. For example,
we will write (p ∧ q) as p ∧ q, and we will write (p → (q ∨ r)) as p → (q ∨ r).


(2) Also, for easier readability, we will often drop the parentheses around (¬φ) to get ¬φ. For
example, we will write (p ∧ (¬q)) as p ∧ ¬q. Notice that we dropped the outermost pair of
parentheses to get p ∧ (¬q), and then we dropped the parentheses around ¬q.


(3) When we apply the negation symbol two or more times in a row, we will not drop parentheses. For
example, (¬(¬p)) will be written as ¬(¬p) and not as ¬¬p.


(4) φ is called a substatement of (¬φ). For example, p is a substatement of ¬p (¬p is the abbreviated
version of (¬p)). Similarly, φ and ψ are substatements of (φ ∧ ψ), (φ ∨ ψ), (φ → ψ), and (φ ↔ ψ).
For example, p and q are both substatements of p ↔ q. Also, if φ is a substatement of ψ and ψ is a
substatement of τ, then we will consider φ to be a substatement of τ. For example, p is a substatement
of ¬(¬p) because p is a substatement of ¬p and ¬p is a substatement of ¬(¬p).


(5) Although we are abbreviating statements by eliminating parentheses, it is important to realize that
those parentheses are there. If we were to use ∧ to form a new statement from p → q and r, it would
be incorrect to write p → q ∧ r. This expression is meaningless, as we do not know whether to apply
→ or ∧ first. The correct expression is (p → q) ∧ r. This is now an acceptable abbreviation for
the statement ((p → q) ∧ r).


Notice that p, q, r, and p → q are all substatements of (p → q) ∧ r, whereas q ∧ r is not a
substatement of (p → q) ∧ r.


Example 9.1: Let p, q, and r be propositional variables. Then we have the following:


1. p, q, and r are statements.


2. (p → q) is a statement (by 3 above). Using Note 1, we will abbreviate this statement as p → q.
p and q are both substatements of p → q.
Example 9.2: Let’s find the substatements of ((p → q) ∨ ¬r) ↔ ¬(q ∧ r).


Solution: The substatements are p, q, r, ¬r, p → q, (p → q) ∨ ¬r, q ∧ r, and ¬(q ∧ r).


Note: The given statement is an abbreviation for (((p → q) ∨ (¬r)) ↔ (¬(q ∧ r))). This is much
harder to read, and shows why we like to use abbreviations.
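The recursive description of substatements translates directly into a short program. In the sketch below, a statement is assumed to be either a variable name (a string) or a nested tuple whose first entry names the connective; this encoding and the function name substatements are choices made for this illustration, not notation from the text.

```python
# A statement is a variable name (str) or a tuple:
# ("not", s), ("and", s, t), ("or", s, t), ("imp", s, t), ("iff", s, t).

def substatements(stmt):
    """Return the set of substatements of stmt (not counting stmt itself)."""
    if isinstance(stmt, str):
        return set()          # an atomic statement has no substatements
    result = set()
    for part in stmt[1:]:     # the immediate parts of a compound statement
        result.add(part)
        result |= substatements(part)  # substatements of substatements count too
    return result

# ((p -> q) or (not r)) <-> (not (q and r)), the statement of Example 9.2
stmt = ("iff",
        ("or", ("imp", "p", "q"), ("not", "r")),
        ("not", ("and", "q", "r")))

found = substatements(stmt)
print(len(found))  # 8
```

The eight elements found are the encodings of p, q, r, ¬r, p → q, (p → q) ∨ ¬r, q ∧ r, and ¬(q ∧ r).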


Logical Equivalence 


Let φ and ψ be statements. We say that φ and ψ are logically equivalent, written φ ≡ ψ, if every truth
assignment of the propositional variables appearing in either φ or ψ (or both) leads to the same truth
value for both statements.


Example 9.3: Let p be a propositional variable, let φ = p, and let ψ = ¬(¬p). If p ≡ T, then φ ≡ T
and ψ ≡ ¬(¬T) ≡ ¬F ≡ T. If p ≡ F, then φ ≡ F and ψ ≡ ¬(¬F) ≡ ¬T ≡ F. So, both possible truth
assignments of p lead to the same truth value for φ and ψ. It follows that φ ≡ ψ (φ and ψ are logically
equivalent).


Notes: (1) One way to determine if two statements φ and ψ are logically equivalent is to draw the truth
table for each statement. We would generally put all the information into a single table. If the columns
corresponding to φ and ψ are a perfect match, then φ ≡ ψ.


Here is a truth table with columns for φ = p and ψ = ¬(¬p).


p | ¬p | ¬(¬p)
T | F  | T
F | T  | F


Observe that the first column gives the truth values for φ, the third column gives the truth values for
ψ, and both these columns are identical. It follows that φ ≡ ψ.


(2) The logical equivalence p ≡ ¬(¬p) is called the law of double negation.

Example 9.4: Let p and q be propositional variables, let φ = ¬(p ∧ q), and let ψ = ¬p ∨ ¬q. If p ≡ F
or q ≡ F, then φ ≡ ¬F ≡ T and ψ ≡ T (because ¬p ≡ T or ¬q ≡ T). If p ≡ T and q ≡ T, then
φ ≡ ¬T ≡ F and ψ ≡ F ∨ F ≡ F. So, all four possible truth assignments of p and q lead to the same
truth value for φ and ψ. It follows that φ ≡ ψ.


Notes: (1) Here is a truth table with columns for φ = ¬(p ∧ q) and ψ = ¬p ∨ ¬q.


p | q | ¬p | ¬q | p ∧ q | ¬(p ∧ q) | ¬p ∨ ¬q
T | T | F  | F  | T     | F        | F
T | F | F  | T  | F     | T        | T
F | T | T  | F  | F     | T        | T
F | F | T  | T  | F     | T        | T


Observe that the sixth column gives the truth values for φ, the seventh column gives the truth values
for ψ, and both these columns are identical. It follows that φ ≡ ψ.


(2) The logical equivalence ¬(p ∧ q) ≡ ¬p ∨ ¬q is one of De Morgan’s laws.


(3) There are two De Morgan’s laws. The second one is ¬(p ∨ q) ≡ ¬p ∧ ¬q. I leave it to the reader
to verify this equivalence.
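Truth-table comparisons like these are easy to automate. The following sketch builds the column of truth values for a statement and compares columns; the function name column and the use of Python functions to represent statements are assumptions of this illustration.

```python
from itertools import product

def column(statement, num_vars):
    """Truth values of statement over all truth assignments.

    Assignments are listed in the order T T, T F, F T, F F,
    which matches the usual row order of a truth table.
    """
    return [statement(*vals) for vals in product([True, False], repeat=num_vars)]

# First De Morgan law: not (p and q) versus (not p) or (not q)
phi = column(lambda p, q: not (p and q), 2)
psi = column(lambda p, q: (not p) or (not q), 2)
print(phi)         # [False, True, True, True]
print(phi == psi)  # True

# Second De Morgan law: not (p or q) versus (not p) and (not q)
second_lhs = column(lambda p, q: not (p or q), 2)
second_rhs = column(lambda p, q: (not p) and (not q), 2)
print(second_lhs == second_rhs)  # True
```

Two statements are logically equivalent exactly when their columns agree in every row, which is what the list comparisons check.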


List 9.1: Here is a list of some useful logical equivalences. The reader should verify each of these by 
drawing a truth table or by using arguments similar to those used in Examples 9.3 and 9.4 (see Problem 
2 below). 


1. Law of double negation: p ≡ ¬(¬p)


2. De Morgan’s laws: ¬(p ∧ q) ≡ ¬p ∨ ¬q     ¬(p ∨ q) ≡ ¬p ∧ ¬q


3. Commutative laws: p ∧ q ≡ q ∧ p     p ∨ q ≡ q ∨ p


4. Associative laws: (p ∧ q) ∧ r ≡ p ∧ (q ∧ r)     (p ∨ q) ∨ r ≡ p ∨ (q ∨ r)


5. Distributive laws: p ∧ (q ∨ r) ≡ (p ∧ q) ∨ (p ∧ r)     p ∨ (q ∧ r) ≡ (p ∨ q) ∧ (p ∨ r)


6. Identity laws: p ∧ T ≡ p     p ∧ F ≡ F     p ∨ T ≡ T     p ∨ F ≡ p


7. Negation laws: p ∧ ¬p ≡ F     p ∨ ¬p ≡ T


8. Redundancy laws: p ∧ p ≡ p     p ∨ p ≡ p


9. Absorption laws: (p ∨ q) ∧ p ≡ p     (p ∧ q) ∨ p ≡ p


10. Law of the conditional: p → q ≡ ¬p ∨ q


11. Law of the contrapositive: p → q ≡ ¬q → ¬p


12. Law of the biconditional: p ↔ q ≡ (p → q) ∧ (q → p)
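Several of these laws can be verified mechanically by running through every truth assignment. The sketch below checks a selection of them; the helper names implies, iff, and holds, and the labels used as dictionary keys, are hypothetical choices for this illustration. The conditional is defined from its truth table (false only when the hypothesis is true and the conclusion is false), so the check of law 10 is not circular.

```python
from itertools import product

def implies(a, b):
    """Conditional a -> b: false only when a is true and b is false."""
    return not (a and not b)

def iff(a, b):
    """Biconditional a <-> b: true exactly when a and b agree."""
    return a == b

# Each entry pairs the two sides of a law as functions of (p, q, r).
laws = {
    "distributive (and over or)": (lambda p, q, r: p and (q or r),
                                   lambda p, q, r: (p and q) or (p and r)),
    "distributive (or over and)": (lambda p, q, r: p or (q and r),
                                   lambda p, q, r: (p or q) and (p or r)),
    "absorption": (lambda p, q, r: (p or q) and p,
                   lambda p, q, r: p),
    "conditional": (lambda p, q, r: implies(p, q),
                    lambda p, q, r: (not p) or q),
    "contrapositive": (lambda p, q, r: implies(p, q),
                       lambda p, q, r: implies(not q, not p)),
    "biconditional": (lambda p, q, r: iff(p, q),
                      lambda p, q, r: implies(p, q) and implies(q, p)),
}

def holds(lhs, rhs):
    """True if lhs and rhs agree on all eight assignments of (p, q, r)."""
    return all(lhs(*vals) == rhs(*vals)
               for vals in product([True, False], repeat=3))

results = {name: holds(lhs, rhs) for name, (lhs, rhs) in laws.items()}
print(results)
```

Every entry of results should be True. The remaining laws in the list can be checked the same way by adding entries to the dictionary.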


Notes: (1) Although this is a fairly long list of laws, a lot of it is quite intuitive. For example, in English
the word “and” is commutative. If the statement “I have a cat and I have a dog” is true, then the
statement “I have a dog and I have a cat” is also true. So, it’s easy to see that p ∧ q ≡ q ∧ p (the first
law in 3 above). As another example, the statement “I have a cat and I do not have a cat” could never
be true. So, it’s easy to see that p ∧ ¬p ≡ F (the first law in 7 above).


(2) The law of the conditional allows us to replace the conditional statement p → q by the more
intuitive statement ¬p ∨ q. We can think of the conditional statement p → q as having the hypothesis
(or premise or assumption) p and the conclusion q. The disjunctive form ¬p ∨ q tells us quite explicitly
that a conditional statement is true if and only if the hypothesis p is false or the conclusion q is true.


(3) A statement that has truth value T for all truth assignments of the propositional variables is called
a tautology. A statement that has truth value F for all truth assignments of the propositional variables
is called a contradiction.


In laws 6 and 7 above, we can replace T by any tautology and F by any contradiction, and the law still
holds. For example, since q ↔ ¬q is a contradiction, by the fourth identity law, p ∨ (q ↔ ¬q) ≡ p.
(4) It’s worth observing that if φ and ψ are sentences, then φ ≡ ψ if and only if φ ↔ ψ is a tautology.
This follows from the fact that φ ↔ ψ ≡ T if and only if φ and ψ have the same truth value.


For example, (p → q) ↔ (¬q → ¬p) is a tautology. This follows from the law of the contrapositive and
the remark in the last paragraph. Let’s look at the complete truth table for this example.


p | q | ¬p | ¬q | p → q | ¬q → ¬p | (p → q) ↔ (¬q → ¬p)
T | T | F  | F  | T     | T       | T
T | F | F  | T  | F     | F       | T
F | T | T  | F  | T     | T       | T
F | F | T  | T  | T     | T       | T


Notice how the columns for (p → q) and (¬q → ¬p) have the same truth values. So, it should be
obvious that the column for (p → q) ↔ (¬q → ¬p) will have only T’s.


The following three additional laws of logical equivalence will be used freely (often without mention): 


1. Law of transitivity of logical equivalence: Let φ, ψ, and τ be statements such that φ ≡ ψ and
ψ ≡ τ. Then φ ≡ τ.


2. Law of substitution of logical equivalents: Let φ, ψ, and τ be statements such that φ ≡ ψ and
φ is a substatement of τ. Let τ* be the sentence formed by replacing φ by ψ inside of τ. Then
τ* ≡ τ.


3. Law of substitution of sentences: Let φ and ψ be statements such that φ ≡ ψ, let p be a
propositional variable, and let τ be a statement. Let φ* and ψ* be the sentences formed by
replacing every instance of p with τ in φ and ψ, respectively. Then φ* ≡ ψ*.


Example 9.5: 


1. Since q ≡ ¬(¬q) (by the law of double negation), we have p ∧ q ≡ p ∧ ¬(¬q). Here we have
used the law of substitution of logical equivalents with φ = q, ψ = ¬(¬q), τ = p ∧ q, and
τ* = p ∧ ¬(¬q).


2. Let’s show that the negation of the conditional statement p → q is logically equivalent to the
statement p ∧ ¬q.


We have ¬(p → q) ≡ ¬(¬p ∨ q) ≡ ¬(¬p) ∧ ¬q ≡ p ∧ ¬q. Here we have used the law of
substitution of logical equivalents together with the law of the conditional, the second De
Morgan’s law, the law of double negation, and the law of transitivity of logical equivalence.


3. Since p → q ≡ ¬p ∨ q (by the law of the conditional), (p ∧ q) → (p ∨ q) ≡ ¬(p ∧ q) ∨ (p ∨ q).
Here we have used the law of substitution of sentences twice. We replaced the propositional
variable p by the statement p ∧ q, and then we replaced the propositional variable q by the
statement p ∨ q.


Notes: (1) If you think about the equivalence ¬(p → q) ≡ p ∧ ¬q from part 2 of Example 9.5 for a
moment, you will realize that it makes perfect sense. Again, we can think of the conditional statement
p → q as having the hypothesis p and the conclusion q. We know the only way to make a conditional
statement false is to make the hypothesis true and the conclusion false.
So, to make the negation of the conditional statement true, we would do the same thing. In other
words, the negation of the conditional is true if p is true and q is false, or equivalently, if p ∧ ¬q is true.


In summary, the logical equivalence ¬(p → q) ≡ p ∧ ¬q says that a conditional statement is false if
and only if the hypothesis is true and the conclusion is false.


(2) By the second associative law, (p ∨ q) ∨ r ≡ p ∨ (q ∨ r). So, we can write p ∨ q ∨ r because
whichever way we choose to think about it (p ∨ q first or q ∨ r first), we get the same truth values.


In part 3 of Example 9.5, we saw that (p ∧ q) → (p ∨ q) ≡ ¬(p ∧ q) ∨ (p ∨ q). By our remarks in the
last paragraph, we can write ¬(p ∧ q) ∨ (p ∨ q) as ¬(p ∧ q) ∨ p ∨ q without causing any confusion.


Example 9.6: Let’s show that the statement p ∧ [(p ∧ ¬q) ∨ q] is logically equivalent to the atomic
statement p.


Solution:
p ∧ [(p ∧ ¬q) ∨ q] ≡ p ∧ [q ∨ (p ∧ ¬q)] ≡ p ∧ [(q ∨ p) ∧ (q ∨ ¬q)] ≡ p ∧ [(q ∨ p) ∧ T]
≡ p ∧ (q ∨ p) ≡ (q ∨ p) ∧ p ≡ (p ∨ q) ∧ p ≡ p
So, we see that p ∧ [(p ∧ ¬q) ∨ q] is logically equivalent to the atomic statement p.


Notes: (1) For the first equivalence, we used the second commutative law. 
(2) For the second equivalence, we used the second distributive law. 

(3) For the third equivalence, we used the second negation law. 

(4) For the fourth equivalence, we used the first identity law. 

(5) For the fifth equivalence, we used the first commutative law. 

(6) For the sixth equivalence, we used the second commutative law. 

(7) For the last equivalence, we used the first absorption law. 


(8) We also used the law of transitivity of logical equivalence and the law of substitution of logical 
equivalents several times. 
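A chain of equivalences like the one in Example 9.6 can be sanity-checked by confirming that every intermediate statement has the same truth table. The sketch below encodes each step as a Python function of p and q; this encoding is an assumption of the illustration, and the check complements the derivation by laws rather than replacing it.

```python
from itertools import product

# Each step of the chain, written as a function of the truth values of p and q.
steps = [
    lambda p, q: p and ((p and not q) or q),         # starting statement
    lambda p, q: p and (q or (p and not q)),         # second commutative law
    lambda p, q: p and ((q or p) and (q or not q)),  # second distributive law
    lambda p, q: p and ((q or p) and True),          # second negation law
    lambda p, q: p and (q or p),                     # first identity law
    lambda p, q: (q or p) and p,                     # first commutative law
    lambda p, q: (p or q) and p,                     # second commutative law
    lambda p, q: p,                                  # first absorption law
]

assignments = list(product([True, False], repeat=2))
tables = [[step(p, q) for p, q in assignments] for step in steps]

# Every step should have the same column of truth values as the first one.
all_agree = all(table == tables[0] for table in tables)
print(all_agree)
```

If a step had been copied incorrectly, its column would differ from the others and all_agree would be False.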


Validity in Sentential Logic 


A logical argument or proof consists of premises (statements that we are given) and conclusions 
(statements we are not given). 


One way to write an argument is to list the premises and conclusions vertically with a horizontal line
separating the premises from the conclusions. If there are two premises φ and ψ, and one conclusion
τ, then the argument would look like this:


φ
ψ
-----
τ


Example 9.7: Let’s take p → q and p to be premises and q to be a conclusion. Here is the argument.


p → q
p
-----
q


A logical argument is valid if every truth assignment that makes all premises true also makes all the 
conclusions true. A logical argument that is not valid is called invalid or a fallacy. 


There are several ways to determine if a logical argument is valid. We will give three methods in the 
next example. 


Example 9.8: Let’s show that the logical argument given in Example 9.7 is valid. The premises are
p → q and p, and the conclusion is q.


p → q
p
-----
q


Solution: Let’s use a truth table to illustrate the three methods.


p    q    p → q    (p → q) ∧ p    [(p → q) ∧ p] → q
T    T      T           T                 T
T    F      F           F                 T
F    T      T           F                 T
F    F      T           F                 T


There are several ways to use this truth table to see that the logical argument is valid. 


Method 1: We use only the first three columns. We look at each row where both premises (columns 1
and 3) are true. Only the first row satisfies this. Since the conclusion (column 2) is also true in the first
row, the logical argument is valid. Symbolically, we write p → q, p ⊢ q, and we say that {p → q, p}
tautologically implies q.


Method 2: We can take the conjunction of the premises, as we did in column 4. We look at each row
where this conjunction is true. Again, only the first row satisfies this. Since the conclusion (column 2) is
also true in the first row, the logical argument is valid. Symbolically, we write (p → q) ∧ p ⊢ q, and we
say that (p → q) ∧ p tautologically implies q.


Method 3: We can use the conjunction of the premises as the hypothesis of the conditional with the
appropriate conclusion, as we did in column 5. We now check that this statement is a tautology.
Symbolically, we can write ⊢ [(p → q) ∧ p] → q (this can be read “[(p → q) ∧ p] → q is a tautology”).


Notes: (1) A valid argument is called a rule of inference. The rule of inference in this example is called 
modus ponens. 


(2) We didn’t need to draw a whole truth table to verify that the argument presented here was valid.
For example, for Method 1, we could argue as follows: If p ≡ T and p → q ≡ T, then we must have
q ≡ T because if q were false, we would have p → q ≡ T → F ≡ F.


(3) p and q could be any statements here. For example, suppose p is the statement “Pigs have wings,”
and q is the statement “Pigs can fly.” Then the argument looks like this:


If pigs have wings, then they can fly
Pigs have wings
-----
Pigs can fly


This seems like a good time to point out that just because a logical argument is valid, it does not mean
that the conclusion is true. We have shown in the solution above that this argument is valid. However,
I think we can all agree that pigs cannot fly!


(4) We say that a logical argument is sound if it is valid and all the premises are true. Note 3 above 
gives an example of an argument that is valid, but not sound. 
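

For arguments with only a few propositional variables, the truth table methods of Example 9.8 can be
carried out mechanically. The following Python sketch is one possible way to do this, not a method
from the text. The helper names implies, is_valid, and is_tautology are assumptions made for
illustration. It applies Method 1 and Method 3 to modus ponens.

```python
from itertools import product

def implies(a, b):
    # Truth function of the conditional a → b
    return (not a) or b

def is_valid(premises, conclusion, num_vars):
    # Method 1: every truth assignment that makes all premises true
    # must also make the conclusion true.
    for values in product([True, False], repeat=num_vars):
        if all(prem(*values) for prem in premises) and not conclusion(*values):
            return False
    return True

def is_tautology(statement, num_vars):
    # A tautology is true under every truth assignment.
    return all(statement(*values)
               for values in product([True, False], repeat=num_vars))

# Modus ponens: premises p → q and p, conclusion q
premises = [lambda p, q: implies(p, q), lambda p, q: p]
conclusion = lambda p, q: q

method1 = is_valid(premises, conclusion, 2)
# Method 3: check that [(p → q) ∧ p] → q is a tautology
method3 = is_tautology(lambda p, q: implies(implies(p, q) and p, q), 2)
print(method1, method3)
```

Note that this only tests validity. Whether an argument is sound depends on whether the premises are
actually true, which no truth table can decide.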


Every tautology gives us at least one rule of inference. 


Example 9.9: Recall the first De Morgan’s law: ¬(p ∧ q) ≡ ¬p ∨ ¬q. This law gives us the following
two rules of inference.


¬(p ∧ q)        ¬p ∨ ¬q
-----           -----
¬p ∨ ¬q         ¬(p ∧ q)


To show that an argument is invalid, we need only produce a single truth assignment that makes all the 
premises true and the conclusion (or one of the conclusions) false. Such a truth assignment is called a 
counterexample. 


Example 9.10: The following invalid argument is called the fallacy of the converse.


p → q
-----
q → p


To see that this argument is invalid, we will find a counterexample. Here we can use the truth
assignment p ≡ F, q ≡ T. We then have that p → q ≡ F → T ≡ T and q → p ≡ T → F ≡ F.


Notes: (1) Consider the conditional statement p → q. The statement q → p is called the converse of
the original conditional statement. The argument in this example shows that the converse of a
conditional statement is not logically equivalent to the original conditional statement.


(2) The statement ¬p → ¬q is called the inverse of the original conditional statement. This statement
is also not logically equivalent to the original conditional statement. The reader should write down the
fallacy of the inverse and give a counterexample to show that it is invalid (as we did above for the
converse).


(3) The statement ¬q → ¬p is called the contrapositive of the original conditional statement. By the
law of the contrapositive, this statement is logically equivalent to the original conditional statement.
The reader should write down the law of the contrapositive as a rule of inference, as was done for the
first De Morgan’s law in Example 9.9.
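

A search for counterexamples can be automated when the number of variables is small. The Python
sketch below is an illustration only (the function name counterexamples is an assumed name, not
notation from the text). It lists every truth assignment that makes all premises true and the
conclusion false, first for the fallacy of the converse and then for the contrapositive.

```python
from itertools import product

def implies(a, b):
    # Truth function of the conditional a → b
    return (not a) or b

def counterexamples(premises, conclusion, num_vars):
    # Collect each truth assignment making every premise true
    # and the conclusion false.
    found = []
    for values in product([True, False], repeat=num_vars):
        if all(prem(*values) for prem in premises) and not conclusion(*values):
            found.append(values)
    return found

# Fallacy of the converse: premise p → q, conclusion q → p
converse = counterexamples([lambda p, q: implies(p, q)],
                           lambda p, q: implies(q, p), 2)

# Contrapositive: premise p → q, conclusion ¬q → ¬p
contrapositive = counterexamples([lambda p, q: implies(p, q)],
                                 lambda p, q: implies(not q, not p), 2)

print(converse)        # [(False, True)], that is, p ≡ F, q ≡ T
print(contrapositive)  # [] because the argument is valid
```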
List 9.2: Here is a list of some useful rules of inference that do not come from tautologies. The reader
should verify that each of the logical arguments given here is valid (see Problem 6 below).


Modus Ponens                Modus Tollens               Disjunctive Syllogism       Hypothetical Syllogism
p → q                       p → q                       p ∨ q                       p → q
p                           ¬q                          ¬p                          q → r
-----                       -----                       -----                       -----
q                           ¬p                          q                           p → r


Conjunctive Introduction    Disjunctive Introduction    Biconditional Introduction  Constructive Dilemma
p                           p                           p → q                       p → q
q                           -----                       q → p                       r → s
-----                       p ∨ q                       -----                       p ∨ r
p ∧ q                                                   p ↔ q                       -----
                                                                                    q ∨ s


Conjunctive Elimination     Disjunctive Resolution      Biconditional Elimination   Destructive Dilemma
p ∧ q                       p ∨ q                       p ↔ q                       p → q
-----                       ¬p ∨ r                      -----                       r → s
p                           -----                       p → q                       ¬q ∨ ¬s
                            q ∨ r                                                   -----
                                                                                    ¬p ∨ ¬r


A derivation is a valid logical argument such that each conclusion follows from the premises and 
conclusions above it using a rule of inference. 


When creating a derivation, we will label each premise and conclusion with a number and state the 
rule of inference and numbers that are used to derive each conclusion. 


Example 9.11: Let’s give a derivation of the following logical argument.


¬p
¬p → ¬q
-----
¬q ∨ p


Solution:
1 | ¬p           Premise
2 | ¬p → ¬q      Premise
-----
3 | ¬q           Modus ponens (2, 1)
4 | ¬q ∨ p       Disjunctive introduction (3)


Notes: (1) We started by listing the premises above the line. 


(2) If we let φ = ¬p and ψ = ¬q, then by modus ponens, we have φ → ψ, φ ⊢ ψ. So, we can write
ψ = ¬q as the third line of the derivation. We applied modus ponens to the sentences in lines 2 and 1
to derive ¬q.


(3) If we let φ = ¬q, then by disjunctive introduction, we have φ ⊢ φ ∨ p, where φ ∨ p = ¬q ∨ p. So, we
can write ¬q ∨ p as the fourth line of the derivation. We applied disjunctive introduction to the
sentence in line 3 to derive ¬q ∨ p.


Example 9.12: Let’s determine if the following logical argument is valid. 


If cats hiss and purr, then dogs can talk. 
Cats hiss. 

Dogs cannot talk. 

Therefore, cats do not purr. 


Solution: Let h represent “Cats hiss,” let p represent “Cats purr,” and let t represent “Dogs can talk.” 
We now give a derivation showing that the argument is valid. 


1 | (h ∧ p) → t    Premise
2 | h              Premise
3 | ¬t             Premise
-----
4 | ¬(h ∧ p)       Modus tollens (1, 3)
5 | ¬h ∨ ¬p        De Morgan’s law (4)
6 | ¬(¬h)          Law of double negation (2)
7 | ¬p             Disjunctive syllogism (5, 6)


Note: The derivation in the solution above shows us that the logical argument is valid. However, notice
that the statement we derived is false. After all, cats do purr. So, although the logical argument is valid,
it is not sound (see Note 4 following Example 9.8). This means that one of the premises must be false.
Which one is it? Well, cats do hiss and dogs cannot talk. So, the false statement must be “If cats hiss
and purr, then dogs can talk.” If it’s not clear to you that this statement is false, use the law of the
conditional to rewrite it as “Cats do not both hiss and purr, or dogs can talk.” Since cats do hiss and
purr, the statement “Cats do not both hiss and purr” is false. Since dogs cannot talk, the statement
“Dogs can talk” is also false. Therefore, the disjunction of those two statements is false.
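

Each step of the derivation in Example 9.12 can be checked semantically: a line is acceptable if it is
tautologically implied by the lines it cites. The Python sketch below is a hedged illustration of that
idea and not the syntactic rule matching used in the text. The dictionary lines and the helper
step_ok are assumed names. It also evaluates the first premise under the truth values we believe hold
in reality (cats hiss, cats purr, dogs cannot talk).

```python
from itertools import product

def implies(a, b):
    # Truth function of the conditional a → b
    return (not a) or b

# Each line: (truth function of (h, p, t), list of cited earlier lines).
# Premises cite no lines.
lines = {
    1: (lambda h, p, t: implies(h and p, t), []),
    2: (lambda h, p, t: h, []),
    3: (lambda h, p, t: not t, []),
    4: (lambda h, p, t: not (h and p), [1, 3]),    # modus tollens
    5: (lambda h, p, t: (not h) or (not p), [4]),  # De Morgan's law
    6: (lambda h, p, t: not (not h), [2]),         # double negation
    7: (lambda h, p, t: not p, [5, 6]),            # disjunctive syllogism
}

def step_ok(n):
    statement, cited = lines[n]
    if not cited:
        return True  # premises are accepted as given
    # The line must be true whenever all cited lines are true.
    return all(statement(*v)
               for v in product([True, False], repeat=3)
               if all(lines[c][0](*v) for c in cited))

all_steps_ok = all(step_ok(n) for n in lines)

# Assumed real-world values: h = True, p = True, t = False
first_premise_in_reality = lines[1][0](True, True, False)
print(all_steps_ok, first_premise_in_reality)
```

The second value printed is False, which agrees with the note above: the argument is valid, but the
first premise is false, so the argument is not sound.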
Problem Set 9


Full solutions to these problems are available for free download here: 


www.SATPrepGet800.com/PMFBXSG 


LEVEL 1 


1. Let φ be the following statement: (p ∧ ¬q) ↔ ¬[p ∨ (¬r → q)].
(i) The statement φ is abbreviated. Write φ in its unabbreviated form.


(ii) Write down all the substatements of φ in both abbreviated and unabbreviated form.


2. Verify all the logical equivalences given in List 9.1. 


LEVEL 2 
3. Let φ, ψ, and τ be statements. Prove that φ ⊢ ψ and ψ ⊢ τ implies φ ⊢ τ.


4. Let φ and ψ be statements. Prove that φ ⊢ ψ if and only if φ → ψ is a tautology.


LEVEL 3 


5. Determine if each of the following statements is a tautology, a contradiction, or neither. 
(i) p ∧ p
(ii) p ∧ ¬p
(iii) (p ∨ ¬p) → (p ∧ ¬p)
(iv) ¬(p ∨ q) ↔ (¬p ∧ ¬q)
(v) p → (¬q ∧ r)
(vi) (p ↔ q) → (p → q)


6. Verify all the rules of inference given in List 9.2. 


LEVEL 4 


7. Determine whether each of the following logical arguments is valid or invalid. If the argument is 
valid, provide a deduction. If the argument is invalid, provide a counterexample. 


I            II            III           IV
p ∨ q        ¬(p ∧ q)      ¬p            p → q
q            q             p ∨ r         r → q
-----        -----         q → ¬r        -----
p            ¬p            -----         p → r
                           ¬q
8. Simplify each statement.
(i) p ∨ (p ∧ ¬p)
(ii) (p ∧ q) ∨ ¬p
(iii) ¬p → (¬q → p)
(iv) (p ∧ ¬q) ∨ p
(v) [(q ∧ p) ∨ q] ∧ [(q ∨ p) ∧ p]


LEVEL 5 


9. Determine if the following logical argument is valid. If the argument is valid, provide a 
deduction. If the argument is invalid, provide a counterexample. 


If a piano has 88 keys, then the box is empty. 

If a piano does not have 88 keys, then paintings are white. 

If we are in immediate danger, then the box is not empty. 
Therefore, paintings are white or we are not in immediate danger. 


10. Determine if the following logical argument is valid. If the argument is valid, provide a 
deduction. If the argument is invalid, provide a counterexample. 


Tangs have fangs or tings have wings. 

It is not the case that tangs have fangs and tings do not have wings. 

It is not the case that tangs do not have fangs and tings have wings. 

Therefore, tangs have fangs and either tings have wings or tangs do not have 
fangs. 
LESSON 10 - SET THEORY
RELATIONS AND FUNCTIONS


Relations 


An unordered pair is a set with 2 elements. Recall that a set doesn’t change if we write the elements
in a different order or if we write the same element multiple times. For example, {0, 1} = {1, 0} and
{0, 0} = {0}.


We now define the ordered pair (x, y) in such a way that (y, x) will not be the same as (x, y). The
simplest way to define a set with this property is as follows:


(x, y) = {{x}, {x, y}}


We now show that with this definition, the ordered pair behaves as we would expect.


Theorem 10.1: (x, y) = (z, w) if and only if x = z and y = w.


Part of the proof of this theorem is a little trickier than expected. Assuming that (x, y) = (z, w), there
are actually two cases to consider: x = y and x ≠ y. If x = y, then (x, y) is a set with just one element.
Indeed, (x, x) = {{x}, {x, x}} = {{x}, {x}} = {{x}}. So, the only element of (x, x) is {x}. Watch carefully
how this plays out in the proof.


Proof of Theorem 10.1: First suppose that x = z and y = w. Then by direct substitution, {x} = {z} and
{x, y} = {z, w}. So, (x, y) = {{x}, {x, y}} = {{z}, {z, w}} = (z, w).


Conversely, suppose that (x, y) = (z, w). Then {{x}, {x, y}} = {{z}, {z, w}}. There are two cases to
consider.


Case 1: If x = y, then {{x}, {x, y}} = {{x}}. So, {{x}} = {{z}, {z, w}}. It follows that {z} = {x} and
{z, w} = {x}. Since {z, w} = {x}, we must have z = x and w = x. Therefore, x, y, z, and w are all equal.
In particular, x = z and y = w.


Case 2: If x ≠ y, then {x, y} is a set with two elements. So, {x, y} cannot be equal to {z} (because {z}
has just one element). Therefore, we must have {x, y} = {z, w}. It then follows that {x} = {z}. So, we
have x = z. Since x = z and {x, y} = {z, w}, we must have y = w. □


Note: (x, y) is an abbreviation for the set {{x}, {x, y}}. In the study of Set Theory, every object can be
written as a set like this. It’s often convenient to use abbreviations, but we should always be aware 
that if necessary, we can write any object in its unabbreviated form. 
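

The set-theoretic definition of the ordered pair can be modeled directly with Python’s frozenset type.
This is a small sketch for experimentation, with pair as an assumed helper name. It builds
{{x}, {x, y}} and then tests the statement of Theorem 10.1 on a small sample of values, which
illustrates the theorem but of course does not prove it.

```python
from itertools import product

def pair(x, y):
    # The ordered pair (x, y) defined as the set {{x}, {x, y}}
    return frozenset({frozenset({x}), frozenset({x, y})})

# Order matters for pairs, even though it does not matter for sets.
order_matters = pair(0, 1) != pair(1, 0)

# When x = y, the pair collapses to the one-element set {{x}}.
collapsed = pair(0, 0) == frozenset({frozenset({0})})

# Theorem 10.1 on the sample {0, 1, 2}:
# (x, y) = (z, w) if and only if x = z and y = w.
theorem_holds = all(
    (pair(x, y) == pair(z, w)) == (x == z and y == w)
    for x, y, z, w in product(range(3), repeat=4)
)
print(order_matters, collapsed, theorem_holds)
```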


We can extend the idea of an ordered pair to an ordered k-tuple. An ordered 3-tuple (also called an
ordered triple) is defined by (x, y, z) = ((x, y), z), an ordered 4-tuple is (x, y, z, w) = ((x, y, z), w),
and so on.


Example 10.1: Let’s write the ordered triple (x, y, z) in its unabbreviated form (take a deep breath!).


(x, y, z) = ((x, y), z) = {{(x, y)}, {(x, y), z}} = {{{{x}, {x, y}}}, {{{x}, {x, y}}, z}}


The Cartesian product of the sets A and B, written A × B, is the set of ordered pairs (a, b) with a ∈ A
and b ∈ B. Symbolically, we have


A × B = {(a, b) | a ∈ A ∧ b ∈ B}.


Example 10.2: 
1. Let A = {0, 1, 2} and B = {a, b}. Then A × B = {(0, a), (0, b), (1, a), (1, b), (2, a), (2, b)}.
2. Let C = ∅ and D = {a, b, c, d}. Then C × D = ∅.
3. Let E = {∅} and F = {Δ, ⋆}. Then E × F = {(∅, Δ), (∅, ⋆)}.


We can extend the definition of the Cartesian product to more than two sets in the obvious way: 


A × B × C = {(a, b, c) | a ∈ A ∧ b ∈ B ∧ c ∈ C}
A × B × C × D = {(a, b, c, d) | a ∈ A ∧ b ∈ B ∧ c ∈ C ∧ d ∈ D}


Example 10.3: 
1. {a} × {1} × {Δ} × {⋆} = {(a, 1, Δ, ⋆)}
2. {0} × {0, 1} × {1} × {0, 1} × {0} = {(0, 0, 1, 0, 0), (0, 0, 1, 1, 0), (0, 1, 1, 0, 0), (0, 1, 1, 1, 0)}


We abbreviate Cartesian products of sets with themselves using exponents. 


A² = A × A          A³ = A × A × A          A⁴ = A × A × A × A


Example 10.4: 
1. ℝ² = ℝ × ℝ = {(x, y) | x, y ∈ ℝ} is the set of ordered pairs of real numbers.


2. ℕ⁵ = ℕ × ℕ × ℕ × ℕ × ℕ = {(a, b, c, d, e) | a, b, c, d, e ∈ ℕ} is the set of ordered 5-tuples of
natural numbers.


3. {0, 1}² = {0, 1} × {0, 1} = {(0, 0), (0, 1), (1, 0), (1, 1)}.
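

Cartesian products of small finite sets are easy to compute. The Python sketch below uses the
standard library function itertools.product for this purpose. The variable names are illustrative,
and Python tuples are used as a stand-in for ordered pairs and k-tuples (they are flat tuples, not the
nested sets used in the formal definition).

```python
from itertools import product

A = {0, 1, 2}
B = {"a", "b"}

# A × B as a set of Python tuples
AxB = set(product(A, B))

# A product with the empty set is empty.
empty = set(product(set(), {"a", "b", "c", "d"}))

# A product of five sets; lists are used so the output order is fixed.
five = list(product([0], [0, 1], [1], [0, 1], [0]))

# A product of a set with itself: {0, 1}²
square = set(product({0, 1}, repeat=2))

print(len(AxB), empty, five, square)
```

Notice that A × B has 3 · 2 = 6 elements, one for each choice of a first coordinate from A and a second
coordinate from B.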


A binary relation on a set A is a subset of A² = A × A. Symbolically, we have


R is a binary relation on A if and only if R ⊆ A × A.


We will usually abbreviate (a, b) ∈ R as aRb.


Example 10.5:


1. Let R = {(a, b) ∈ ℕ × ℕ | a < b}. For example, we have (0, 1) ∈ R because 0 < 1. However,
(1, 1) ∉ R because 1 ≮ 1. We abbreviate (0, 1) ∈ R by 0R1.


Observe that R ⊆ ℕ × ℕ, and so, R is a binary relation on ℕ.


We would normally use the name < for this relation R. So, we have (0, 1) ∈ <, which we
abbreviate as 0 < 1, and we have (1, 1) ∉ <, which we abbreviate as 1 ≮ 1.


2. There are binary relations ≤, <, ≥, > defined on ℕ, ℤ, ℚ, and ℝ. For example, if we consider
> ⊆ ℚ², then for rational numbers a and b, we have (a, b) ∈ > if and only if a > b.
3. Let R = {((a, b), (c, d)) ∈ (ℤ × ℤ*)² | ad = bc}. Then R is a binary relation on ℤ × ℤ*. For
example, (1, 2)R(2, 4) because 1 · 4 = 2 · 2. However, (1, 2)R(2, 5) is false because 1 · 5 ≠ 2 · 2.
Compare this to the rational number system where we have 1/2 = 2/4 because 1 · 4 = 2 · 2, but
1/2 ≠ 2/5 because 1 · 5 ≠ 2 · 2.


We say that a binary relation R on A is
• reflexive if for all a ∈ A, (a, a) ∈ R.
• symmetric if for all a, b ∈ A, (a, b) ∈ R implies (b, a) ∈ R.
• transitive if for all a, b, c ∈ A, (a, b), (b, c) ∈ R implies (a, c) ∈ R.
• antireflexive if for all a ∈ A, (a, a) ∉ R.
• antisymmetric if for all a, b ∈ A, (a, b) ∈ R and (b, a) ∈ R implies a = b.


Example 10.6: 


1. Let A be any set, and let R = {(a, b) ∈ A² | a = b}. Then R is reflexive (a = a), symmetric (if
a = b, then b = a), transitive (if a = b and b = c, then a = c), and antisymmetric (trivially). If
A ≠ ∅, then this relation is not antireflexive because a ≠ a is false for any a ∈ A.


2. The binary relations ≤ and ≥ defined in the usual way on ℤ are transitive (if a ≤ b and b ≤ c,
then a ≤ c, and similarly for ≥), reflexive (a ≤ a and a ≥ a), and antisymmetric (if a ≤ b and
b ≤ a, then a = b, and similarly for ≥). These relations are not symmetric (for example, 1 ≤ 2,
but 2 ≰ 1). These relations are not antireflexive (for example, 1 ≤ 1 is true).


Any relation that is transitive, reflexive, and antisymmetric is called a partial ordering. 


3. The binary relations < and > defined on ℤ are transitive (if a < b and b < c, then a < c, and
similarly for >), antireflexive (a ≮ a and a ≯ a), and antisymmetric (this is vacuously true
because a < b and b < a can never occur). These relations are not symmetric (for example,
1 < 2, but 2 ≮ 1). These relations are not reflexive (for example, 1 < 1 is false).


Any relation that is transitive, antireflexive, and antisymmetric is called a strict partial ordering. 


4. Let R = {(0, 0), (0, 2), (2, 0), (2, 2), (2, 3), (3, 2), (3, 3)} be a relation on ℝ. Then it is easy to
see that R is symmetric. R is not reflexive because 1 ∈ ℝ, but (1, 1) ∉ R (however, if we were
to consider R as a relation on {0, 2, 3} instead of on ℝ, then R would be reflexive). R is not
transitive because we have (0, 2), (2, 3) ∈ R, but (0, 3) ∉ R. R is not antisymmetric because
we have (2, 3), (3, 2) ∈ R and 2 ≠ 3. R is not antireflexive because (0, 0) ∈ R.
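

For a relation given as a finite set of pairs, each of the five properties above can be tested directly
from its definition. The following Python sketch checks the relation R from part 4. The function names
are assumed for illustration, and since ℝ cannot be enumerated, the finite set {0, 1, 2, 3} is used as
a stand-in for the underlying set (an assumption that is enough to witness the failure of reflexivity).

```python
def is_reflexive(R, A):
    return all((a, a) in R for a in A)

def is_antireflexive(R, A):
    return all((a, a) not in R for a in A)

def is_symmetric(R):
    return all((b, a) in R for (a, b) in R)

def is_antisymmetric(R):
    # Whenever (a, b) and (b, a) are both in R, we need a = b.
    return all(a == b for (a, b) in R if (b, a) in R)

def is_transitive(R):
    # Whenever (a, b) and (b, d) are both in R, we need (a, d) in R.
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)

R = {(0, 0), (0, 2), (2, 0), (2, 2), (2, 3), (3, 2), (3, 3)}
A = {0, 1, 2, 3}  # finite stand-in for the real numbers

results = {
    "reflexive on A": is_reflexive(R, A),
    "reflexive on {0, 2, 3}": is_reflexive(R, {0, 2, 3}),
    "antireflexive": is_antireflexive(R, A),
    "symmetric": is_symmetric(R),
    "antisymmetric": is_antisymmetric(R),
    "transitive": is_transitive(R),
}
print(results)
```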


We can extend the idea of a binary relation on a set A to an n-ary relation on A. For example, a 3-ary
relation (or ternary relation) on A is a subset of A³ = A × A × A. More generally, we have that R is an
n-ary relation on A if and only if R ⊆ Aⁿ. A 1-ary relation (or unary relation) on A is just a subset of A.


Example 10.7: Let R = {(x, y, z) ∈ ℤ³ | x + y = z}. Then R is a ternary (or 3-ary) relation on ℤ. We
have, for example, (1, 2, 3) ∈ R (because 1 + 2 = 3) and (1, 2, 4) ∉ R (because 1 + 2 ≠ 4).


Equivalence Relations and Partitions 


A binary relation R on a set A is an equivalence relation if R is reflexive, symmetric, and transitive. 


Example 10.8: 


1. The most basic equivalence relation on a set A is the relation R = {(a, b) ∈ A² | a = b} (the
equality relation). We already saw in part 1 of Example 10.6 that this relation is reflexive,
symmetric, and transitive.


2. Another trivial equivalence relation on a set A is the set A². Since every ordered pair
(a, b) is in A², reflexivity, symmetry, and transitivity can never fail.


3. We say that integers a and b have the same parity if they are both even or both odd. Define ≡₂
on ℤ by ≡₂ = {(a, b) ∈ ℤ² | a and b have the same parity}. It is easy to see that ≡₂ is reflexive
(a ≡₂ a because every integer has the same parity as itself), ≡₂ is symmetric (if a ≡₂ b, then a
has the same parity as b, so b has the same parity as a, and therefore, b ≡₂ a), and ≡₂ is
transitive (if a ≡₂ b and b ≡₂ c, then a, b, and c all have the same parity, and so, a ≡₂ c).
Therefore, ≡₂ is an equivalence relation.


Another way to say that a and b have the same parity is to say that b − a is divisible by 2, or
equivalently, 2|b − a (see Lesson 4). This observation allows us to generalize the notion of
having the same parity. For example, ≡₃ = {(a, b) ∈ ℤ² | 3|b − a} is an equivalence relation,
and more generally, for each n ∈ ℤ⁺, ≡ₙ = {(a, b) ∈ ℤ² | n|b − a} is an equivalence relation. I
leave the proof that ≡ₙ is reflexive, symmetric, and transitive on ℤ as an exercise (see Problem
4 in the problem set below).


4. Consider the relation R = {((a, b), (c, d)) ∈ (ℤ × ℤ*)² | ad = bc} defined in part 3 of Example
10.5. Since ab = ba, we see that (a, b)R(a, b), and therefore, R is reflexive. If (a, b)R(c, d),
then ad = bc. Therefore, cb = da, and so, (c, d)R(a, b). Thus, R is symmetric. Finally, suppose
that (a, b)R(c, d) and (c, d)R(e, f). Then ad = bc and cf = de. So, adcf = bcde. Using the
fact that (ℤ, +, ·) is a commutative ring, we get cd(af − be) = adcf − bcde = 0. If a = 0,
then bc = 0, and so, c = 0 (because b ≠ 0). So, de = 0, and therefore, e = 0 (because d ≠ 0).
So, af = be (because they’re both 0). If a ≠ 0, then c ≠ 0. Therefore, af − be = 0, and so,
af = be. Since a = 0 and a ≠ 0 both lead to af = be, we have (a, b)R(e, f). So, R is
transitive. Since R is reflexive, symmetric, and transitive, it follows that R is an equivalence
relation.


Recall: (1) If X is a nonempty set of sets, we say that X is pairwise disjoint if for all A, B ∈ X with
A ≠ B, A and B are disjoint (A ∩ B = ∅).


(2) If X is a nonempty set of sets, then union X is defined by ⋃X = {y | there is Y ∈ X with y ∈ Y}.


A partition of a set S is a set of pairwise disjoint nonempty subsets of S whose union is S. Symbolically,
X is a partition of S if and only if


∀A ∈ X(A ≠ ∅ ∧ A ⊆ S) ∧ ∀A, B ∈ X(A ≠ B → A ∩ B = ∅) ∧ ⋃X = S.


Example 10.9:


1. Let E = {2k | k ∈ ℤ} be the set of even integers and let O = {2k + 1 | k ∈ ℤ} be the set of odd
integers. Then X = {E, O} is a partition of ℤ. We can visualize this partition as follows:


ℤ = {…, −4, −2, 0, 2, 4, …} ∪ {…, −3, −1, 1, 3, …}


2. Let A = {3k | k ∈ ℤ}, B = {3k + 1 | k ∈ ℤ}, and C = {3k + 2 | k ∈ ℤ}. Then X = {A, B, C} is a
partition of ℤ. A rigorous proof of this requires results similar to those given in Example 4.7 and
the notes following (or you can wait for the Division Algorithm, which will be presented in
Lesson 12). We can visualize this partition as follows:


ℤ = {…, −6, −3, 0, 3, 6, …} ∪ {…, −5, −2, 1, 4, 7, …} ∪ {…, −4, −1, 2, 5, 8, …}


3. For each n ∈ ℤ, let Aₙ = (n, n + 1]. Then X = {Aₙ | n ∈ ℤ} is a partition of ℝ. We can visualize
this partition as follows:


ℝ = ⋯ ∪ (−2, −1] ∪ (−1, 0] ∪ (0, 1] ∪ (1, 2] ∪ (2, 3] ∪ ⋯


4. For each r ∈ ℝ, let Aᵣ = {r + bi | b ∈ ℝ}. Then X = {Aᵣ | r ∈ ℝ} is a partition of ℂ.


5. The only partition of ∅ is ∅.


6. The only partition of the one element set {a} is {{a}}.


7. The partitions of the two element set {a, b} with a ≠ b are {{a}, {b}} and {{a, b}}.


We will now explore the relationship between equivalence relations and partitions. Let’s begin with an 
example. 


Example 10.10: Consider the equivalence relation ≡₂ from part 3 of Example 10.8, defined by a ≡₂ b
if and only if a and b have the same parity, and the partition {E, O} of ℤ from part 1 of Example 10.9.
For this partition, we are thinking of ℤ as the union of the even and odd integers:


ℤ = {…, −4, −2, 0, 2, 4, …} ∪ {…, −3, −1, 1, 3, …}


Observe that a and b are in the same member of the partition if and only if a ≡₂ b. For example,
−8 ≡₂ 4 and −8, 4 ∈ E, whereas −8 ≢₂ 3 and −8 ∈ E, 3 ∈ O. In fact, E = {n ∈ ℤ | n ≡₂ 0} and
O = {n ∈ ℤ | n ≡₂ 1}. We call E the equivalence class of 0 and we call O the equivalence class of 1.


Let ~ be an equivalence relation on a set S. If x ∈ S, the equivalence class of x, written [x], is the set
[x] = {y ∈ S | x ~ y}.


Example 10.10 continued: We have [0] = {y ∈ ℤ | 0 ≡₂ y} = E. Observe that [2] = [0], and in fact, if
n is any even integer, then [n] = [0] = E. Similarly, if n is any odd integer, then [n] = [1] = O.

Example 10.11: Recall that the power set of A, written P(A), is the set consisting of all subsets of A.


P(A) = {X | X ⊆ A}


For example, if A = {a, b, c}, then P(A) = {∅, {a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c}}. We can define
a binary relation ~ on P(A) by X ~ Y if and only if |X| = |Y| (X and Y have the same number of
elements). It is easy to see that ~ is an equivalence relation on P(A). There are four equivalence
classes.


[∅] = {∅}
[{a}] = {{a}, {b}, {c}}
[{a, b}] = {{a, b}, {a, c}, {b, c}}
[{a, b, c}] = {{a, b, c}}


Notes: (1) {a} ~ {b} ~ {c} because each of these sets has one element. It follows that {a}, {b}, and {c} 
are all in the same equivalence class. Above, we chose to use {a} as the representative for this 
equivalence class. This is an arbitrary choice. In fact, [{a}] = [{b}] = [{c}]. 


Similarly, [{a, b}] = [{a,c}] = [{b, c}]. 


(2) The empty set is the only subset of A with 0 elements. Therefore, the equivalence class of Ø contains 
only itself. Similarly, the equivalence class of A = {a, b,c} contains only itself. 


(3) Notice that the four equivalence classes are pairwise disjoint, nonempty, and their union is P(A). 
In other words, the equivalence classes form a partition of P(A). 
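

The equivalence classes in Example 10.11 can be computed by machine. The Python sketch below is an
illustration only, with power_set and equivalence_class as assumed names. It builds P(A) for
A = {a, b, c}, groups the subsets by number of elements, and then checks the three conditions in the
definition of a partition.

```python
from itertools import combinations

A = ["a", "b", "c"]

# P(A): all subsets of A, stored as frozensets so they can be set elements
power_set = [frozenset(c)
             for k in range(len(A) + 1)
             for c in combinations(A, k)]

def equivalence_class(X):
    # [X] = {Y in P(A) : |X| = |Y|}
    return frozenset(Y for Y in power_set if len(Y) == len(X))

# The set of all equivalence classes (duplicates collapse automatically)
classes = {equivalence_class(X) for X in power_set}

# The three conditions for a partition of P(A)
nonempty = all(len(c) > 0 for c in classes)
pairwise_disjoint = all(c.isdisjoint(d)
                        for c in classes for d in classes if c != d)
covers = frozenset().union(*classes) == frozenset(power_set)

print(len(classes), nonempty, pairwise_disjoint, covers)
```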


Theorem 10.2: Let P be a partition of a set S. Then there is an equivalence relation ~ on S for which 
the elements of P are the equivalence classes of ~. Conversely, if ~ is an equivalence relation on a set 
S, then the equivalence classes of ~ form a partition of S. 


You will be asked to prove Theorem 10.2 in Problem 17 below. 
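

Before proving Theorem 10.2, it may help to see both directions in action on a small finite set. In the
Python sketch below, the function names relation_from_partition and classes_from_relation are
assumptions made for illustration. We start with a partition, build the relation “x and y lie in the
same member of the partition,” and then recover the partition as the set of equivalence classes.

```python
def relation_from_partition(P):
    # x ~ y if and only if x and y lie in the same member of P
    return {(x, y) for block in P for x in block for y in block}

def classes_from_relation(R, S):
    # The set of equivalence classes [x] = {y in S : x ~ y}
    return {frozenset(y for y in S if (x, y) in R) for x in S}

S = {0, 1, 2, 3, 4, 5}
P = {frozenset({0, 2, 4}), frozenset({1, 3, 5})}  # evens and odds in S

R = relation_from_partition(P)
recovered = classes_from_relation(R, S)
print(recovered == P)
```

This is a single example, not a proof. The proof requested in Problem 17 must work for every set S,
including infinite ones.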


Important note: We will sometimes want to define relations or operations on equivalence classes.
When we do this, we must be careful that what we are defining is well-defined. For example, consider
the equivalence relation ≡₂ on ℤ, and let X = {[0], [1]} be the set of equivalence classes.


Let’s attempt to define a relation on X by [x]R[y] if and only if x < y. Is [0]R[1] true? It looks like it is
because 0 < 1. But this isn’t the end of the story. Since [0] = [2], if [0]R[1], then we must also have
[2]R[1] (by a direct substitution). But 2 ≮ 1! So, [2]R[1] is false. To summarize, [0]R[1] should be true
and [2]R[1] should be false, but [0]R[1] and [2]R[1] represent the same statement. So, R is not a
well-defined relation on X.


As another example, let’s attempt to define an operation +: X × X → X by [x] + [y] = [x + y]. This is
a well-defined operation. We proved in Theorem 4.1 from Lesson 4 that the sum of two even integers 
is even. Similar arguments can be used to show that the sum of two odd integers is even and that the 
sum of an odd integer and an even integer is odd. These results can now be used to show that the 
operation + is well-defined. For example, if [x] = [0] and [y] = [0], then x and y are even. By Theorem 
4.1, it follows that x + y is even. So, [x + y] = [0]. Since [0 + 0] = [0], [x] + [y] = [0] + [0]. The 
reader should check the other three cases to finish verifying that + is well-defined on X. 
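

The difference between these two attempted definitions can be seen experimentally. In the Python
sketch below, an equivalence class under ≡₂ is represented by the parity of any of its members (the
helper name cls and the sample range are assumptions made for illustration). We test, on a finite
sample of representatives, whether the outcome depends only on the classes and not on the
representatives chosen.

```python
def cls(x):
    # Represent the equivalence class [x] under ≡₂ by the parity of x
    return x % 2

reps = range(-6, 7)  # a finite sample of representatives

# All ways of choosing two representatives for each of two classes
same_classes = [(x, y, x2, y2)
                for x in reps for y in reps
                for x2 in reps for y2 in reps
                if cls(x) == cls(x2) and cls(y) == cls(y2)]

# [x] + [y] = [x + y]: the class of the sum never depends on the choice.
plus_ok = all(cls(x + y) == cls(x2 + y2) for x, y, x2, y2 in same_classes)

# [x]R[y] iff x < y: the answer can change with the representatives.
r_conflicts = [(x, y, x2, y2) for x, y, x2, y2 in same_classes
               if (x < y) != (x2 < y2)]

print(plus_ok, (0, 1, 2, 1) in r_conflicts)
```

A finite sample can expose a definition that is not well-defined, as it does for R here, but it cannot
prove that + is well-defined. That still requires the argument given above.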


This principle applies whenever there are elements in a set that can be represented in more than one
way. Let’s take the set of rational numbers as an example. Each rational number has infinitely many
representations. For example, 1/2 = 2/4 = 3/6 = ⋯ and so on. When verifying that (ℚ, +, ·) is a field (see
Problems 9 and 11 from Lesson 3), were you careful to check that addition and multiplication are
well-defined on ℚ? If not, you may want to go back and do so now. Also take a look at Theorem 5.1.


Orderings


A binary relation ≤ on a set A is a partial ordering on A if ≤ is reflexive, antisymmetric, and transitive
on A. If we replace “reflexive” by “antireflexive,” then we call the relation a strict partial ordering on
A (we would normally use the symbol < instead of ≤ for a strict partial ordering).


A partially ordered set (or poset) is a pair (A, ≤), where A is a set and ≤ is a partial ordering on A.
Similarly, a strict poset is a pair (A, <), where A is a set and < is a strict partial ordering on A.


Example 10.12:


1. The usual ordering ≤ on ℤ = {…, −3, −2, −1, 0, 1, 2, 3, …} is a partial ordering, and the ordering
< on ℤ is a strict partial ordering. See Example 10.6 (parts 2 and 3).


2. If A is a set, then (P(A), ⊆) is a poset. Since every set is a subset of itself, ⊆ is reflexive. If
X, Y ∈ P(A) with X ⊆ Y and Y ⊆ X, then X = Y by the Axiom of Extensionality (see the
Technical note after Theorem 2.5 in Lesson 2). So, ⊆ is antisymmetric. By Theorem 2.3, ⊆ is
transitive. The following tree diagrams give visual representations of this poset when A = {a, b}
and A = {a, b, c}. For a detailed explanation of these diagrams, see Example 2.8 in Lesson 2.


[Figure: tree diagrams of (P(A), ⊆) for A = {a, b} and A = {a, b, c}]


Let (A, ≤) be a poset. We say that a, b ∈ A are comparable if a ≤ b or b ≤ a. The poset satisfies the
comparability condition if every pair of elements in A are comparable. A poset that satisfies the
comparability condition is called a linearly ordered set (or totally ordered set). Similarly, a strict
linearly ordered set (A, <) satisfies trichotomy: If a, b ∈ A, then a < b, a = b, or b < a.

Example 10.13: 


1. (ℕ, ≤), (ℤ, ≤), (ℚ, ≤), and (ℝ, ≤) are linearly ordered sets. Problem 5 from Lesson 5 (parts (i),
(ii), and (iv)) show that (ℚ, ≤) and (ℝ, ≤) are linearly ordered.


Similarly, (ℕ, <), (ℤ, <), (ℚ, <), and (ℝ, <) are strict linearly ordered sets.


2. If A has at least two elements, then (P(A), ⊆) is not linearly ordered. Indeed, if a, b ∈ A with
a ≠ b, then {a} ⊈ {b} and {b} ⊈ {a}. See either of the tree diagrams above at the end of
Example 10.12.
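

The failure of the comparability condition in (P(A), ⊆) can be observed directly for small sets. The
Python sketch below is a rough illustration, with power_set and incomparable_pairs as assumed
helper names. It lists the pairs of subsets of A that are not comparable under ⊆, using Python’s <=
operator on frozensets, which tests the subset relation.

```python
from itertools import combinations

def power_set(A):
    # All subsets of A as frozensets
    elements = sorted(A)
    return [frozenset(c)
            for k in range(len(elements) + 1)
            for c in combinations(elements, k)]

def incomparable_pairs(A):
    # Pairs {X, Y} of subsets with neither X ⊆ Y nor Y ⊆ X
    return [(X, Y) for X, Y in combinations(power_set(A), 2)
            if not (X <= Y or Y <= X)]

one_element = incomparable_pairs({"a"})
two_elements = incomparable_pairs({"a", "b"})

print(one_element)   # no incomparable pairs: P({a}) is linearly ordered
print(two_elements)  # the single pair {a}, {b}
```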


Functions 
Let A and B be sets. f is a function from A to B, written f: A → B, if the following two conditions hold.
1. f ⊆ A × B.
2. For all a ∈ A, there is a unique b ∈ B such that (a, b) ∈ f.




If f: A → B, the domain of f, written dom f, is the set A, and the range of f, written ran f, is the set
{f(a) | a ∈ A}. Observe that ran f ⊆ B. The set B is sometimes called the codomain of f. When we
know that f is a function, we will abbreviate (a, b) ∈ f by f(a) = b.
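

For finite sets, the two conditions in the definition of a function can be checked mechanically. The
Python sketch below is one possible way to do this. The names is_function and ran, as well as the
sample sets f and g, are illustrative assumptions. A relation is represented as a set of pairs.

```python
def is_function(f, A, B):
    # Condition 1: f is a subset of A × B.
    if not all(a in A and b in B for (a, b) in f):
        return False
    # Condition 2: each a in A is paired with exactly one b.
    return all(len({b for (x, b) in f if x == a}) == 1 for a in A)

def ran(f):
    # The range of f: the set of all second coordinates
    return {b for (_, b) in f}

f = {(0, "a"), (1, "a")}
g = {(0, "a"), (0, "b")}

f_is_function = is_function(f, {0, 1}, {"a"})
g_is_function = is_function(g, {0}, {"a", "b"})
print(f_is_function, ran(f), g_is_function)
```

Here g fails the second condition because 0 is paired with two different elements.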


Example 10.14: 


1. f = {(0, a), (1, a)} is a function with dom f = {0, 1} and ran f = {a}. Instead of (0, a) ∈ f, we
will usually write f(0) = a. Similarly, instead of (1, a) ∈ f, we will write f(1) = a. Here is a
visual representation of this function.


[Figure: arrow diagram of f, with 0 and 1 both mapped to a]


This function f: {0, 1} → {a} is called a constant function because the range of f consists of a
single element.


Note also that f is a binary relation on the set {0, 1, a}. In general, a function f: A → B is a
binary relation on A ∪ B.


2. If a ≠ b, then g = {(0, a), (0, b)} is not a function because it violates the second condition in
the definition of being a function. It is, however, a binary relation on {0, a, b}.


[Figure: arrow diagram of g, with 0 mapped to both a and b]


3. h = {(a, b) | a, b ∈ ℝ ∧ a > 0 ∧ a² + b² = 2} is a relation on ℝ that is not a function. (1, 1)
and (1, −1) are both elements of h, violating the second condition in the definition of a
function. See the figure below on the left. Notice how a vertical line hits the graph twice.


4. k = {(a, b) | a, b ∈ ℝ ∧ b > 0 ∧ a² + b² = 2} is a function. See the figure above on the right.
To see that the second condition in the definition of a function is satisfied, suppose that (a, b)
and (a, c) are both in k. Then a² + b² = 2, a² + c² = 2, and b and c are both positive. It
follows that b² = c², and since b and c are both positive, we have b = c.


We have dom k = (−√2, √2) and ran k = (0, √2]. So, k: (−√2, √2) → (0, √2].


5. A function with domain ℕ is called an infinite sequence. For example, let g: ℕ → {0, 1} be
defined by g(n) = 0 if n is even and g(n) = 1 if n is odd. A nice way to visualize an infinite
sequence is to list the “outputs” of the sequence in order in parentheses. So, we may write g as
(0, 1, 0, 1, 0, 1, …). In general, if A is a nonempty set and f: ℕ → A is a sequence, then we can
write f as (f(0), f(1), f(2), …).


Similarly, a finite sequence is a function with domain {0, 1, …, n − 1} for some n. For example,
the sequence (0, 2, 4, 6, 8, 10) is the function h: {0, 1, 2, 3, 4, 5} → ℕ defined by h(k) = 2k. If
the domain of a finite sequence is {0, 1, …, n − 1}, we say that the length of the sequence is n.


Observe how a finite sequence with domain {0, 1, …, n − 1} and range A looks just like an
n-tuple in Aⁿ. In fact, it’s completely natural to identify a finite sequence of length n with the
corresponding n-tuple. So, (0, 2, 4, 6, 8, 10) can be thought of as a 6-tuple from ℕ⁶, or as the
function h: {0, 1, 2, 3, 4, 5} → ℕ defined by h(k) = 2k.


Informally, we can think of an infinite sequence as an infinite length tuple. As one more
example, (1, 2, 4, 8, 16, 32, …) represents the sequence k: ℕ → ℕ defined by k(n) = 2ⁿ.


Note: In the study of set theory, we define the natural numbers by letting 0 = ∅, 1 = {0}, 2 = {0, 1},
3 = {0, 1, 2}, … and so on. In general, the natural number n is the set of all its predecessors. Specifically,
n = {0, 1, 2, …, n − 1}. Using this notation, we can say that a finite sequence of length n is a function
f: n → A for some set A. For example, the function h above has domain 6, so that h: 6 → ℕ.


A function f: A → B is injective (or one-to-one), written f: A ↪ B, if for all a, b ∈ A, if a ≠ b, then
f(a) ≠ f(b). In this case, we call f an injection.


Note: The contrapositive of the statement “If a ≠ b, then f(a) ≠ f(b)” is “If f(a) = f(b), then
a = b.” So, we can say that a function f: A → B is injective if for all a, b ∈ A, if f(a) = f(b), then
a = b.


A function f: A → B is surjective (or onto B), written f: A ↠ B, if for all b ∈ B, there is an a ∈ A such
that f(a) = b. In this case, we call f a surjection.


A function f: A → B is bijective, written f: A ≅ B, if f is both an injection and a surjection. In this case,
we call f a bijection.
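

For functions between finite sets, the three definitions above can be checked mechanically. The following is a minimal sketch, assuming that a function is stored as a Python dict from inputs to outputs and that every output lies in the stated codomain. The names is_injective, is_surjective, and is_bijective are illustrative and are not notation from the text.

```python
def is_injective(f):
    """f is injective if no two inputs share an output."""
    outputs = list(f.values())
    return len(outputs) == len(set(outputs))


def is_surjective(f, codomain):
    """f is surjective if every element of the codomain is an output."""
    return set(codomain) <= set(f.values())


def is_bijective(f, codomain):
    """f is bijective if it is both injective and surjective."""
    return is_injective(f) and is_surjective(f, codomain)


# A two-point function that sends both inputs to the same output.
f = {0: "a", 1: "a"}
print(is_injective(f))                # False
print(is_surjective(f, {"a"}))        # True
print(is_surjective(f, {"a", "b"}))   # False
```

The answer to the surjectivity question changes with the codomain, even though the dict itself does not change.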
Example 10.15: 


1. f = {(0, a), (1, a)} from part 1 of Example 10.14 is not an injective function because f(0) = a,
f(1) = a, and 0 ≠ 1. If we think of f as f: {0, 1} → {a}, then f is surjective. However, if we
think of f as f: {0, 1} → {a, b}, then f is not surjective. So, surjectivity depends upon the
codomain of the function.


2. k = {(a, b) | a, b ∈ ℝ ∧ b > 0 ∧ a² + b² = 2} from part 4 of
Example 10.14 is not an injective function. For example,
(1, 1) ∈ k because 1² + 1² = 1 + 1 = 2 and (−1, 1) ∈ k
because (−1)² + 1² = 1 + 1 = 2. Notice how a horizontal
line hits the graph twice. If we think of k as a function from
(−√2, √2) to ℝ⁺, then k is not surjective. For example,
2 ∉ ran k because for any a ∈ ℝ, a² + 2² = a² + 4 ≥ 4,
and so, a² + 2² cannot be equal to 2. However, if instead we
consider k as a function with codomain (0, √2], that is
k: (−√2, √2) → (0, √2], then k is surjective. Indeed, if
0 < b ≤ √2, then 0 < b² ≤ 2, and so, 2 − b² ≥ 0.
Therefore, a = √(2 − b²) is a real number such that k(a) = b.


3. Define g: ℝ → ℝ by g(x) = 7x − 3. Then g is injective because if
g(a) = g(b), we then have 7a − 3 = 7b − 3. Using the fact that ℝ is
a field, we get 7a = 7b (by the additive inverse property), and then
a = b (by the multiplicative inverse property). Also, g is surjective
because if b ∈ ℝ, then (b + 3)/7 ∈ ℝ (because ℝ is a field) and


g((b + 3)/7) = 7((b + 3)/7) − 3 = (b + 3) − 3 = b + (3 − 3) = b + 0 = b.


Therefore, g is bijective. See the image to the right for a visual
representation of ℝ² and the graph of the function g.


Notice that any vertical line will hit the graph of g exactly once because
g is a function with domain ℝ. Also, any horizontal line will hit the
graph exactly once because g is bijective. Injectivity ensures that each horizontal line hits the
graph at most once and surjectivity ensures that each horizontal line hits the graph at least
once.


If f: A → B is bijective, we define f⁻¹: B → A, the inverse of f, by f⁻¹ = {(b, a) | (a, b) ∈ f}. In other
words, for each b ∈ B, f⁻¹(b) = “the unique a ∈ A such that f(a) = b.”


Notes: (1) Let f: A → B be bijective. Since f is surjective, for each b ∈ B, there is an a ∈ A such that
f(a) = b. Since f is injective, there is only one such value of a.


(2) The inverse of a bijective function is also bijective.


Example 10.16: 


1. Define f: {0, 1} → {a, b} by f = {(0, a), (1, b)}. Then f is a bijection and f⁻¹: {a, b} → {0, 1} is
defined by f⁻¹ = {(a, 0), (b, 1)}. Observe that f⁻¹ is also a bijection.


2. Let E = {0, 2, 4, 6, 8, …} be the set of even natural numbers and let O = {1, 3, 5, 7, 9, …} be the
set of odd natural numbers. The function f: E → O defined by f(n) = n + 1 is a bijection with
inverse f⁻¹: O → E defined by f⁻¹(n) = n − 1.


3. If X and Y are sets, we define ^X Y to be the set of functions from X to Y. Symbolically, we have


^X Y = {f | f: X → Y}.


For example, if A = {a, b} and B = {0, 1}, then ^A B has 4 elements (each element is a function
from A to B). The elements are f₁ = {(a, 0), (b, 0)}, f₂ = {(a, 0), (b, 1)}, f₃ = {(a, 1), (b, 0)},
and f₄ = {(a, 1), (b, 1)}.


[Figure: arrow diagrams of the four functions f₁, f₂, f₃, and f₄ from A to B.]


Define F: ^A B → P(A) by F(f) = {x ∈ A | f(x) = 1}.
So, F(f₁) = ∅, F(f₂) = {b}, F(f₃) = {a}, and F(f₄) = {a, b}.
Since P(A) = {∅, {a}, {b}, {a, b}}, we see that F is a bijection from ^A B to P(A).


The inverse of F is the function F⁻¹: P(A) → ^A B defined by


F⁻¹(C)(x) = 0 if x ∉ C, and F⁻¹(C)(x) = 1 if x ∈ C.


So, we see that F⁻¹(∅) = f₁, F⁻¹({b}) = f₂, F⁻¹({a}) = f₃, and F⁻¹({a, b}) = f₄.


4. For A ≠ ∅ and B = {0, 1}, the function F: ^A B → P(A) defined by F(f) = {x ∈ A | f(x) = 1}
is always a bijection.


To see that F is injective, let f, g ∈ ^A B with f ≠ g. Since f and g are different, there is some
a ∈ A such that either f(a) = 0, g(a) = 1 or f(a) = 1, g(a) = 0. Without loss of generality,
assume that f(a) = 0, g(a) = 1. Since f(a) = 0, a ∉ F(f). Since g(a) = 1, a ∈ F(g). So,
F(f) ≠ F(g). Since f ≠ g implies F(f) ≠ F(g), F is injective.


To see that F is surjective, let C ∈ P(A), so that C ⊆ A. Define f ∈ ^A B by f(x) = 0 if x ∉ C
and f(x) = 1 if x ∈ C. Then x ∈ F(f) if and only if f(x) = 1 if and only if x ∈ C. So, F(f) = C.
Since C ∈ P(A) was arbitrary, F is surjective.


As in 3, the inverse of F is the function F⁻¹: P(A) → ^A B defined by F⁻¹(C)(x) = 0 if x ∉ C, and
F⁻¹(C)(x) = 1 if x ∈ C.
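

For a small finite set, the correspondence between functions into {0, 1} and subsets can be checked by brute force. The sketch below assumes a three-element set A chosen only for illustration, stores each function as a dict, and uses made-up helper names (functions, F, F_inverse, power_set). It is a finite sanity check and not a replacement for the proof above.

```python
from itertools import combinations, product


def functions(A, B):
    """All functions from A to B, each stored as a dict."""
    A = list(A)
    return [dict(zip(A, values)) for values in product(B, repeat=len(A))]


def F(f):
    """Send a function f into {0, 1} to the set of inputs that f maps to 1."""
    return frozenset(x for x in f if f[x] == 1)


def F_inverse(C, A):
    """Send a subset C of A to the function that is 1 on C and 0 elsewhere."""
    return {x: (1 if x in C else 0) for x in A}


def power_set(A):
    """All subsets of A, each stored as a frozenset."""
    A = list(A)
    return {frozenset(c) for r in range(len(A) + 1) for c in combinations(A, r)}


A = ["a", "b", "c"]
fs = functions(A, [0, 1])
images = [F(f) for f in fs]

print(len(fs))                                     # 8
print(len(set(images)) == len(fs))                 # True: F is injective
print(set(images) == power_set(A))                 # True: F is surjective
print(all(F_inverse(F(f), A) == f for f in fs))    # True: F_inverse undoes F
```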


Notes: (1) See the Note following Theorem 6.6 in Lesson 6 for an explanation of the expression 
“Without loss of generality,” and how to properly use it in a proof. 


(2) As in the note following Example 10.14, using the notation n = {0, 1, 2, …, n − 1}, we have just
shown that for any nonempty set A, there is a bijection f: ^A 2 → P(A).


Given functions f: A → B and g: B → C, the composite of f and g, written g ∘ f: A → C, is defined by
(g ∘ f)(a) = g(f(a)) for all a ∈ A. Symbolically, we have


g ∘ f = {(a, c) ∈ A × C | there is a b ∈ B such that (a, b) ∈ f and (b, c) ∈ g}.


We can visualize the composition of two functions f and g as follows. 


[Figure: arrows from a ∈ A to f(a) ∈ B, from f(a) to g(f(a)) ∈ C, and directly from a to (g ∘ f)(a).]


In the picture above, sets A, B, and C are drawn as different shapes simply to emphasize that they can
all be different sets. Starting with an arbitrary element a ∈ A, we have an arrow showing a being
mapped by f to f(a) ∈ B and another arrow showing f(a) being mapped by g to g(f(a)) ∈ C. There
is also an arrow going directly from a ∈ A to (g ∘ f)(a) = g(f(a)) in C. Note that the only way we
know how to get from a to (g ∘ f)(a) is to first travel from a to f(a), and then to travel from f(a) to
g(f(a)).


Example 10.17: Define f: ℚ → ℝ by f(x) = x√2 and define g: ℝ → {0, 1} by g(x) = 0 if x ∈ ℚ and
g(x) = 1 if x ∈ ℝ \ ℚ. Then g ∘ f: ℚ → {0, 1} is defined by (g ∘ f)(x) = 0 if x = 0 and
(g ∘ f)(x) = 1 if x ≠ 0.


To see this, observe that (g ∘ f)(0) = g(f(0)) = g(0 ⋅ √2) = g(0) = 0 because 0 ∈ ℚ. If x ∈ ℚ \ {0},
then x√2 ∉ ℚ because if y = x√2 ∈ ℚ, then since ℚ is a field, √2 = x⁻¹y ∈ ℚ, which we know to be
false. So, (g ∘ f)(x) = g(f(x)) = g(x√2) = 1.


It will be important to know that when we take the composition of bijective functions, we always get a 
bijective function. We will prove this in two steps. We will first show that the composition of injective 
functions is injective. We will then show that the composition of surjective functions is surjective. 


Theorem 10.3: If f: A ↪ B and g: B ↪ C, then g ∘ f: A ↪ C.


Note: We are given that f and g are injections, and we want to show that g ∘ f is an injection. We can
show this directly using the definition of injectivity, or we can use the contrapositive of the definition
of injectivity. Let’s do it both ways.


Direct proof of Theorem 10.3: Suppose that f: A ↪ B and g: B ↪ C, and let x, y ∈ A with x ≠ y. Since
f is injective, f(x) ≠ f(y). Since g is injective, g(f(x)) ≠ g(f(y)). So, (g ∘ f)(x) ≠ (g ∘ f)(y). Since
x, y ∈ A were arbitrary, g ∘ f: A ↪ C. □


Contrapositive proof of Theorem 10.3: Suppose that f: A ↪ B and g: B ↪ C, let x, y ∈ A and suppose
that (g ∘ f)(x) = (g ∘ f)(y). Then g(f(x)) = g(f(y)). Since g is injective, f(x) = f(y). Since f is
injective, x = y. Since x, y ∈ A were arbitrary, g ∘ f: A ↪ C. □


Theorem 10.4: If f: A ↠ B and g: B ↠ C, then g ∘ f: A ↠ C.


Proof: Suppose that f: A ↠ B and g: B ↠ C, and let c ∈ C. Since g is surjective, there is b ∈ B with
g(b) = c. Since f is surjective, there is a ∈ A with f(a) = b. So, (g ∘ f)(a) = g(f(a)) = g(b) = c.
Since c ∈ C was arbitrary, g ∘ f is surjective. □


Corollary 10.5: If f: A ≅ B and g: B ≅ C, then g ∘ f: A ≅ C.


Proof: Suppose that f: A ≅ B and g: B ≅ C. Then f and g are injective. By Theorem 10.3, g ∘ f is
injective. Also, f and g are surjective. By Theorem 10.4, g ∘ f is surjective. Since g ∘ f is both injective
and surjective, g ∘ f is bijective. □


Note: A corollary is a theorem that follows easily from a theorem or theorems that have already been 
proved. 


If A is any set, then we define the identity function on A, written i_A: A → A, by i_A(a) = a for all a ∈ A.
Note that the identity function on A is a bijection from A to itself.


Theorem 10.6: If f: A ≅ B, then f⁻¹ ∘ f = i_A and f ∘ f⁻¹ = i_B.


Proof: Let a ∈ A with f(a) = b. Then f⁻¹(b) = a, and so, (f⁻¹ ∘ f)(a) = f⁻¹(f(a)) = f⁻¹(b) = a.
Since i_A(a) = a, we see that (f⁻¹ ∘ f)(a) = i_A(a). Since a ∈ A was arbitrary, f⁻¹ ∘ f = i_A.


Now, let b ∈ B. Since f: A ≅ B, there is a unique a ∈ A with f(a) = b. Equivalently, f⁻¹(b) = a. We
have (f ∘ f⁻¹)(b) = f(f⁻¹(b)) = f(a) = b. Since i_B(b) = b, we see that (f ∘ f⁻¹)(b) = i_B(b).
Since b ∈ B was arbitrary, f ∘ f⁻¹ = i_B. □
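

The statements about composition, inverses, and identity functions can be tried out on small finite sets. This is a hedged sketch: functions are stored as dicts, the sets A, B, C and the particular bijections f and g are made up for illustration, and the helper names compose, inverse, and identity are not from the text.

```python
def compose(g, f):
    """Return g ∘ f as a dict, so that (g ∘ f)(a) = g(f(a))."""
    return {a: g[f[a]] for a in f}


def inverse(f):
    """Return the inverse of a bijection stored as a dict."""
    return {b: a for a, b in f.items()}


def identity(A):
    """Return the identity function on A as a dict."""
    return {a: a for a in A}


A, B, C = {0, 1, 2}, {"x", "y", "z"}, {"p", "q", "r"}
f = {0: "x", 1: "y", 2: "z"}          # a bijection from A to B
g = {"x": "q", "y": "r", "z": "p"}    # a bijection from B to C

h = compose(g, f)
print(h == {0: "q", 1: "r", 2: "p"})              # True: g ∘ f computed pointwise
print(len(set(h.values())) == len(h))             # True: g ∘ f is injective
print(set(h.values()) == C)                       # True: g ∘ f is surjective
print(compose(inverse(f), f) == identity(A))      # True
print(compose(f, inverse(f)) == identity(B))      # True
```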


Equinumerosity 
We say that two sets A and B are equinumerous, written A ~ B, if there is a bijection f: A ≅ B.


It is easy to see that ~ is an equivalence relation. For any set A, the identity function i_A: A → A is a
bijection, showing that ~ is reflexive. For sets A and B, if f: A ≅ B, then f⁻¹: B ≅ A, showing that ~ is
symmetric. For sets A, B, and C, if f: A ≅ B and g: B ≅ C, then g ∘ f: A ≅ C by Corollary 10.5, showing
that ~ is transitive.


Example 10.18: 


1. Let A = {anteater, elephant, giraffe} and B = {apple, banana, orange}. Then A ~ B. We can
define a bijection f: A ≅ B by f(anteater) = apple, f(elephant) = banana, and
f(giraffe) = orange. This is not the only bijection from A to B, but we need only find one (or
prove one exists) to show that the sets are equinumerous.


2. At this point it should be easy to see that two finite sets are equinumerous if and only if they
have the same number of elements. It should also be easy to see that a finite set can never be
equinumerous with an infinite set.


3. Let ℕ = {0, 1, 2, 3, 4, …} be the set of natural numbers and E = {0, 2, 4, 6, 8, …} the set of even
natural numbers. Then ℕ ~ E. We can actually see a bijection between these two sets just by
looking at the sets themselves.


0  1  2  3  4  5   6   …
0  2  4  6  8  10  12  …


The function f: ℕ → E defined by f(n) = 2n is an explicit bijection. To see that f maps ℕ into
E, just observe that if n ∈ ℕ, then 2n ∈ E by the definition of an even integer (see Lesson 4). f
is injective because if f(n) = f(m), then 2n = 2m, and so, n = m. Finally, f is surjective
because if n ∈ E, then there is k ∈ ℕ such that n = 2k. So, f(k) = 2k = n.


4. ℕ ~ ℤ via the bijection f: ℕ ≅ ℤ defined by f(n) = n/2 if n is even and f(n) = −(n + 1)/2
if n is odd.


Let’s look at this correspondence visually:


0  1   2  3   4  5   6  …
0  −1  1  −2  2  −3  3  …


Many students get confused here because they are under the misconception that the integers
should be written “in order.” However, when checking to see if two sets are equinumerous, we
do not include any other structure. In other words, we are just trying to “pair up” elements. It
does not matter how we do so.


You will be asked to verify that the function f defined above is a bijection in Problem 8 below.


5. For A any nonempty set, ^A {0, 1} ~ P(A). We showed this in part 4 of Example 10.16.
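

The pairing of ℕ with ℤ from part 4 can be explored numerically. The sketch below is a finite check only, and it does not replace the proof requested in Problem 8. The name f_inverse and its formula are assumptions introduced here (a candidate inverse read off from the table above), and the ranges tested are arbitrary.

```python
def f(n):
    """The map from ℕ to ℤ: n/2 for even n, and -(n + 1)/2 for odd n."""
    if n % 2 == 0:
        return n // 2
    return -((n + 1) // 2)


def f_inverse(z):
    """A candidate inverse: z >= 0 comes from 2z, and z < 0 comes from -2z - 1."""
    return 2 * z if z >= 0 else -2 * z - 1


print([f(n) for n in range(7)])                               # [0, -1, 1, -2, 2, -3, 3]
print(all(f_inverse(f(n)) == n for n in range(1000)))         # True
print(all(f(f_inverse(z)) == z for z in range(-500, 501)))    # True
```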


We say that a set is countable if it is equinumerous with a subset of ℕ. It’s easy to visualize a countable
set because a bijection from a subset of ℕ to a set A generates a list. For example, the set E can be
listed as 0, 2, 4, 6, … and the set ℤ can be listed as 0, −1, 1, −2, 2, … (see Example 10.18 above).


There are two kinds of countable sets: finite sets and denumerable sets. We say that a set is 
denumerable if it is countably infinite. 


At this point, you may be asking yourself if all infinite sets are denumerable. If this were the case, then 
we would simply have finite sets and infinite sets, and that would be the end of it. However, there are 
in fact infinite sets that are not denumerable. An infinite set that is not denumerable is uncountable. 


Theorem 10.7 (Cantor’s Theorem): If A is any set, then A is not equinumerous with P(A). 


Analysis: How can we prove that A is not equinumerous with P(A)? Well, we need to show that there 
does not exist a bijection from A to P(A). Recall that a bijection is a function which is both an injection 
and a surjection. So, we will attempt to show that there do not exist any surjections from A to P(A). 
To do this, we will take an arbitrary function f: A → P(A), and then argue that f is not surjective. We
will show that ran f ≠ P(A) by finding a set B ∈ P(A) \ ran f. In words, we will find a subset of A
that is not in the range of f.


Let’s begin by looking at ℕ, the set of natural numbers. Given a specific function f: ℕ → P(ℕ), it’s not
too hard to come up with a set B ∈ P(ℕ) \ ran f. Let’s choose a specific such f and use this example
to try to come up with a procedure for describing the set B.


f(0) = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, …}
f(1) = {0, 1, 3, 4, 5, 6, 7, 8, 9, 10, …}
f(2) = {0, 1, 4, 5, 6, 7, 8, 9, 10, …}
f(3) = {0, 1, 4, 6, 7, 8, 9, 10, …}
f(4) = {0, 1, 4, 6, 8, 9, 10, …}


Technical note: Recall that a prime number is a natural number with exactly two factors, 1 and itself.
The set of prime numbers looks like this: {2, 3, 5, 7, 11, 13, 17, …}. The function f: ℕ → P(ℕ) that we
chose to use here is defined by f(n) = {k ∈ ℕ | k is not equal to one of the first n prime numbers}.
Notice how f(0) is just the set ℕ of all natural numbers, f(1) is the set of all natural numbers except
2 (we left out the first prime), f(2) is the set of all natural numbers except 2 and 3 (we left out the first
two primes), and so on. Prime numbers will be covered in detail in Lesson 12.


Observe that the “inputs” of our function are natural numbers, and the “outputs” are sets of natural 
numbers. So, it’s perfectly natural to ask the question “Is n in f(n)?” 


For example, we see that 0 ∈ f(0), 1 ∈ f(1), and 4 ∈ f(4). However, we also see that 2 ∉ f(2) and
3 ∉ f(3).


Let’s let B be the set of natural numbers n that are not inside their images. Symbolically, we have


B = {n ∈ ℕ | n ∉ f(n)}.


Which natural numbers are in the set B? Well, we already said that 0 ∈ f(0). It follows that 0 ∉ B.
Similarly, 1 ∉ B and 4 ∉ B, but 2 ∈ B and 3 ∈ B.


Why did we choose to define B this way? The reason is that we are trying to make sure that B
cannot be equal to f(n) for any n. Since 0 ∈ f(0), but 0 ∉ B, it follows that f(0) and B are different
sets because they differ by at least one element, namely 0. Similarly, since 1 ∈ f(1), but 1 ∉ B, B
cannot be equal to f(1). What about 2? Well, 2 ∉ f(2), but 2 ∈ B. Therefore, B ≠ f(2) as well, and
so on down the line. We intentionally chose to make B disagree with f(n) for every natural number n,
ensuring that B will not be in the range of f.


I think we are now ready to prove the theorem.


Proof of Theorem 10.7: Let f: A → P(A), and let B = {a ∈ A | a ∉ f(a)}. Suppose toward
contradiction that B ∈ ran f. Then there is a ∈ A with f(a) = B. But then we have a ∈ B if and only
if a ∉ f(a) if and only if a ∉ B. This contradiction tells us that B ∉ ran f, and so, f is not surjective.
Since f: A → P(A) was arbitrary, there does not exist a surjection from A to P(A), and therefore, there
is no bijection from A to P(A). So, A is not equinumerous with P(A). □
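

For a small finite set, the diagonal construction in the proof can be tested exhaustively. The following sketch assumes a three-element set A picked only for illustration, enumerates every function from A to P(A) as a dict, and checks that the set B built in the proof is never in the range. The helper names power_set and diagonal_set are made up for this example.

```python
from itertools import combinations, product


def power_set(A):
    """All subsets of A, each stored as a frozenset."""
    A = list(A)
    return [frozenset(c) for r in range(len(A) + 1) for c in combinations(A, r)]


def diagonal_set(f):
    """The set B = {a in A | a not in f(a)} from the proof."""
    return frozenset(a for a in f if a not in f[a])


A = [0, 1, 2]
subsets = power_set(A)
all_functions = [dict(zip(A, images)) for images in product(subsets, repeat=len(A))]

print(len(all_functions))                                               # 512
print(all(diagonal_set(f) not in f.values() for f in all_functions))    # True
```

Each of the 512 functions misses its own diagonal set, so none of them is surjective, which matches the theorem for this particular A.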


So, for example, ℕ is not equinumerous with P(ℕ). Which of these two sets is the “bigger” one? Let’s
consider the function f: ℕ → P(ℕ) defined by f(n) = {n}. This function looks like this:


0    1    2    3    4    …
{0}  {1}  {2}  {3}  {4}  …


Observe that we are matching up each natural number with a subset of natural numbers (a very simple
subset consisting of just one natural number) in a way so that different natural numbers get matched
with different subsets. In other words, we defined an injective function from ℕ to P(ℕ). It seems like
there are lots of subsets of ℕ that didn’t get mapped to (for example, all infinite subsets of ℕ). So, it
seems that ℕ is a “smaller” set than P(ℕ).


We use the notation A ≼ B if there is an injective function from A to B.


A ≼ B if and only if ∃f(f: A ↪ B)


We write A ≺ B if A ≼ B and A ≁ B.


So, for example, ℕ ≺ P(ℕ).


Theorem 10.8: If A is any set, then A ≺ P(A).


Proof: The function f: A → P(A) defined by f(a) = {a} is injective. So, A ≼ P(A). By Theorem 10.7,
A ≁ P(A). It follows that A ≺ P(A). □


Example 10.19: If we let A = P(ℕ), we can apply Theorem 10.8 to this set A to see that
P(ℕ) ≺ P(P(ℕ)). Continuing in this fashion, we get a sequence of increasingly larger sets.


ℕ ≺ P(ℕ) ≺ P(P(ℕ)) ≺ P(P(P(ℕ))) ≺ ⋯


If A and B are arbitrary sets, in general it can be difficult to determine if A and B are equinumerous by 
producing a bijection. Luckily, the next theorem provides an easier way. 


Theorem 10.9 (The Cantor-Schroeder-Bernstein Theorem): If A and B are sets such that A ≼ B and
B ≼ A, then A ~ B.


Note: At first glance, many students think that Theorem 10.9 is obvious and that the proof must be
trivial. This is not true. The theorem says that if there is an injective function from A to B and another
injective function from B to A, then there is a bijective function from A to B. This is a deep result, which
is far from obvious. Constructing a bijection from two arbitrary injections is not an easy thing to do. I
suggest that the reader take a few minutes to try to do it, if for no other reason than to convince
themselves that the proof is difficult. I leave the proof itself as an optional exercise.


Example 10.20: Let’s use Theorem 10.9 to prove that the open interval of real numbers (0,1) is 
equinumerous to the closed interval of real numbers [0, 1]. 


Analysis: Since (0, 1) ⊆ [0, 1], there is an obvious injective function f: (0, 1) → [0, 1] (just send each
element to itself).


The harder direction is finding an injective function g from [0, 1] into (0, 1).
We will do this by drawing a line segment with endpoints (0, 1/4) and (1, 3/4).
This will give us a bijection from [0, 1] to [1/4, 3/4]. We can visualize this bijection
using the graph to the right. We will write an equation for this line in the
slope-intercept form y = mx + b. Here m is the slope of the line and b is
the y-intercept of the line. We can use the graph to see that b = 1/4 and


m = (3/4 − 1/4)/(1 − 0) = (2/4)/1 = 1/2.


So, we define g: [0, 1] → (0, 1) by g(x) = (1/2)x + 1/4.


Let’s write out the details of the proof. 
Proof: Let f: (0, 1) → [0, 1] be defined by f(x) = x. Clearly, f is injective, so that (0, 1) ≼ [0, 1].


Next, we define g: [0, 1] → ℝ by g(x) = (1/2)x + 1/4. If 0 ≤ x ≤ 1, then 0 ≤ (1/2)x ≤ 1/2 and therefore,
1/4 ≤ (1/2)x + 1/4 ≤ 3/4. Since 0 < 1/4 and 3/4 < 1, we have 0 < g(x) < 1. Therefore, g: [0, 1] → (0, 1).
If x ≠ x′, then (1/2)x ≠ (1/2)x′, and so, g(x) = (1/2)x + 1/4 ≠ (1/2)x′ + 1/4 = g(x′). This shows that g is
injective. It follows that [0, 1] ≼ (0, 1).


Since (0, 1) ≼ [0, 1] and [0, 1] ≼ (0, 1), it follows from the Cantor-Schroeder-Bernstein Theorem that
(0, 1) ~ [0, 1]. □
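

The injection from [0, 1] into (0, 1) used in the proof is a line through two chosen endpoints, and the same recipe works for other endpoints. Here is a minimal sketch with exact rational arithmetic. The function name make_injection, its parameters a and b (the heights of the line at x = 0 and x = 1, assumed to satisfy 0 < a < b < 1), and the sample points are all illustrative choices, and a finite sample can only suggest injectivity, not prove it.

```python
from fractions import Fraction


def make_injection(a, b):
    """Return g(x) = (b - a)x + a, the line through (0, a) and (1, b)."""
    if not (0 < a < b < 1):
        raise ValueError("need 0 < a < b < 1")

    def g(x):
        return (b - a) * x + a

    return g


g = make_injection(Fraction(1, 4), Fraction(3, 4))
samples = [Fraction(k, 10) for k in range(11)]    # 0, 1/10, ..., 1
images = [g(x) for x in samples]

print(g(0), g(1))                          # 1/4 3/4
print(all(0 < y < 1 for y in images))      # True: images land in (0, 1)
print(len(set(images)) == len(samples))    # True: distinct inputs, distinct outputs
```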


Notes: (1) If A ⊆ B, then the function f: A → B defined by f(a) = a for all a ∈ A is always injective. It
is called the inclusion map.


(2) It is unfortunate that the same notation is used for points and open intervals. Normally this isn’t an 
issue, but in this particular example both usages of this notation appear. Take another look at the 
analysis above and make sure you can see when the notation (a, b) is being used for a point and when 
it is being used for an open interval. 


(3) We could have used any interval [a, b] with 0 < a < b < 1 in place of [1/4, 3/4].


Problem Set 10


Full solutions to these problems are available for free download here: 


www.SATPrepGet800.com/PMFBXSG 


LEVEL 1 
1. For each set A below, evaluate (i) A²; (ii) P(A); (iii) ^A A.
   1. A = ∅   2. A = {∅}   3. A = {0, 1}   4. A = P({0})


2. Find all partitions of the three-element set {a, b, c} and the four-element set {a, b, c, d}. 


LEVEL 2 


3. For a, b ∈ ℕ, we will say that a divides b, written a|b, if there is a natural number k such that
b = ak. Notice that | is a binary relation on ℕ. Prove that (ℕ, |) is a partially ordered set, but it
is not a linearly ordered set.


4. Prove that for each n ∈ ℤ⁺, ≡ₙ (see part 3 of Example 10.8) is an equivalence relation on ℤ.


5. Let A, B, and C be sets. Prove the following:
(i) If A ⊆ B, then A ≼ B.
(ii) ≼ is transitive.
(iii) ≺ is transitive.
(iv) If A ≼ B and B ≺ C, then A ≺ C.
(v) If A ≺ B and B ≼ C, then A ≺ C.


6. Let A and B be sets such that A ⊆ B. Prove that P(A) ≼ P(B).


LEVEL 3 
7. For f, g ∈ ^ℝ ℝ, define f ≼ g if and only if for all x ∈ ℝ, f(x) ≤ g(x). Is (^ℝ ℝ, ≼) a poset? Is it
a linearly ordered set? What if we replace ≼ by ≼*, where f ≼* g if and only if there is an x ∈ ℝ
such that f(x) ≤ g(x)?


8. Prove that the function f: ℕ → ℤ defined by f(n) = n/2 if n is even and f(n) = −(n + 1)/2 if n is odd
is a bijection.


9. Define P_k(ℕ) for each k ∈ ℕ by P₀(ℕ) = ℕ and P_{k+1}(ℕ) = P(P_k(ℕ)) for k ≥ 0. Find a set B
such that for all k ∈ ℕ, P_k(ℕ) ≺ B.


10. Prove that if A ~ B and C ~ D, then A × C ~ B × D.


LEVEL 4


11. Define a partition P of ℕ such that P ~ ℕ and for each X ∈ P, X ~ ℕ.


12. Prove that a countable union of countable sets is countable.


13. Let A and B be sets such that A ~ B. Prove that P(A) ~ P(B).


14. Prove the following:
(i) ℕ × ℕ ~ ℕ.
(ii) ℚ ~ ℕ.
(iii) Any two intervals of real numbers are equinumerous (including ℝ itself).
(iv) ^ℕ ℕ ~ P(ℕ).


15. Prove that {A ∈ P(ℕ) | A is infinite} is uncountable.


16. For f, g ∈ ^ℕ ℕ, define f <* g if and only if there is n ∈ ℕ such that for all m > n, f(m) < g(m).
(i) Is (^ℕ ℕ, <*) a strict poset?
(ii) Is (^ℕ ℕ, <*) a strict linearly ordered set?
(iii) Let F = {fₙ: ℕ → ℕ | n ∈ ℕ} be a countable set of functions. Must there be a function
g ∈ ^ℕ ℕ such that for all n ∈ ℕ, fₙ <* g?


17. Let P be a partition of a set S. Prove that there is an equivalence relation ~ on S for which the
elements of P are the equivalence classes of ~. Conversely, if ~ is an equivalence relation on a
set S, prove that the equivalence classes of ~ form a partition of S.


LEVEL 5 


18. Prove that if A ~ B and C ~ D, then ^A C ~ ^B D.


19. Prove that for any sets A, B, and C, ^(B × C) A ~ ^C (^B A).


20. Prove the following:
(i) P(ℕ) ~ {f ∈ ^ℕ ℕ | f is a bijection}.
(ii) ^ℕ ℝ ≁ ^ℝ ℕ, given that ℝ ~ P(ℕ).


CHALLENGE PROBLEM 


21. Prove the Cantor-Schroeder-Bernstein Theorem.


LESSON 11 - ABSTRACT ALGEBRA
STRUCTURES AND HOMOMORPHISMS


Structures and Substructures 


An n-ary relation on a set S is a subset of Sⁿ. We usually use the expressions unary, binary, and ternary
in place of 1-ary, 2-ary, and 3-ary. Note that a unary relation on S is simply a subset of S. We do not
define a 0-ary relation.


Example 11.1: Let ℤ = {…, −3, −2, −1, 0, 1, 2, 3, …} be the set of integers. The set ℕ = {0, 1, 2, 3, …}
of natural numbers is a unary relation on ℤ. In other words, ℕ ⊆ ℤ. Some examples of binary relations
on ℤ are the linear orderings <, ≤, >, and ≥ (see Example 10.5 (part 2)) and the equivalence relations
≡ₙ = {(a, b) ∈ ℤ² | n|b − a} (see Example 10.8 (part 3)). R = {(x, y, z) ∈ ℤ³ | x + y = z} is an
example of a ternary relation on ℤ (see Example 10.7).


An n-ary operation on a set S is a function from Sⁿ to S. We also define a 0-ary operation to simply be
an element of S. We will usually call a 0-ary operation a constant in S.


Example 11.2: Let ℝ be the set of real numbers. Negation is an example of a unary operation on ℝ.
This is the operation that maps each x ∈ ℝ to −x. Addition, subtraction, and multiplication are
examples of binary operations on ℝ. 0 is an example of a 0-ary operation on ℝ or a constant in ℝ.


A finitary relation is an n-ary relation for some n ∈ ℕ*. A finitary operation is an n-ary operation for
some n ∈ ℕ.


A structure is a set together with a collection of finitary operations and relations defined on the set. 
The set is called the domain of the structure. 


Example 11.3: 


1. Semigroups, monoids, and groups are structures of the form (S, ⋆), where S is a set and ⋆ is a
binary operation on S.


We may want to view a monoid as a structure of the form (S, ⋆, e) and a group as a structure of
the form (S, ⋆, ⁻¹, e), where e is a constant called the identity element of the monoid or group
and ⁻¹ is the unary inverse operator.


2. Rings and fields are structures of the form (S, +, ⋅), where S is a set, and + and ⋅ are binary
operations on S. Again, we may want to include additional operations (see part 4 of Example
11.5 and also part 4 of Example 11.6 below).


3. Ordered rings and fields are structures of the form (S, +, ⋅, ≤), where S is a set, + and ⋅ are
binary operations on S, and ≤ is a binary relation on S.


4. Every set without any operations and relations is a structure. For example, ℕ, ℤ, ℚ, ℝ, and ℂ
are structures. (Notice that we abbreviate the structure (S) as S.)


5. We can view a vector space (V, ⊕) over the field (F, +, ⋅) as (V ∪ F, V, F, R_⊕, R_+, R_⋅, R_s),
where V and F are unary relations, and R_⊕, R_+, R_⋅, R_s are the following ternary relations:


R_⊕ = {(x, y, z) ∈ V³ | x ⊕ y = z}          R_+ = {(x, y, z) ∈ F³ | x + y = z}
R_⋅ = {(x, y, z) ∈ F³ | x ⋅ y = z}          R_s = {(x, y, z) ∈ F × V × V | xy = z}


Notice that we had to use ternary relations instead of binary functions for the four operations
because the definition of a structure demands that functions be defined on (V ∪ F)². However,
none of the functions are defined on (V ∪ F)². Indeed, ⊕ is defined only on V², + and ⋅ are
defined only on F², and scalar multiplication is defined on F × V.


We will sometimes use a fraktur letter (such as 𝔄, 𝔅, ℭ) for the name of a structure if we want to be
clear that we are talking about the whole structure and not just the underlying set. For example, we
might write 𝔊 = (G, ⋆) for a group 𝔊 with underlying set G and group operation ⋆.


Notes: (1) A finitary operation on a set S is a function f: Sⁿ → S for some n ∈ ℕ. There are two
important facts implied by this definition:


1. The operation f is defined for every n-tuple (a₁, a₂, …, aₙ) ∈ Sⁿ.
2. The set S is closed under f.


(2) A finitary relation on a set S is a subset R of Sⁿ for some n ∈ ℕ*. We have more flexibility with
relations than we do with operations. For example, an (n + 1)-ary relation can be used to define a
partial n-ary function. Suppose we want a structure that consists of the set of integers ℤ together with
the partial function defined on only the even integers that divides each even integer by 2. We can
define a relation R = {(2k, k) | k ∈ ℤ}. The structure (ℤ, R) consists of the set of integers together
with the function f: 2ℤ → ℤ defined by f(n) = n/2 (2ℤ is the set of even integers). Notice that we
defined a unary partial function on ℤ by using a binary relation.
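

The idea of encoding a partial function as a relation can be sketched on a finite window of the integers. Everything specific here is an assumption for illustration: the window −5 ≤ k ≤ 5, the helper name is_partial_function, and the use of a dict named halve to look up outputs.

```python
# A finite window of the relation R = {(2k, k) | k ∈ ℤ}.
R = {(2 * k, k) for k in range(-5, 6)}


def is_partial_function(relation):
    """Each first coordinate is paired with at most one second coordinate."""
    firsts = [x for (x, _) in relation]
    return len(firsts) == len(set(firsts))


halve = dict(R)    # the partial function determined by R

print(is_partial_function(R))    # True
print(halve.get(8))              # 4
print(halve.get(7))              # None, because 7 is odd and so outside the domain
```

The relation is a perfectly good subset of ℤ², while the lookup fails on odd inputs, which is exactly what being a partial function on ℤ means.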


We say that structures 𝔄 and 𝔅 have the same type if they have the same number of n-ary operations
for each n ∈ ℕ, and the same number of n-ary relations for each n ∈ ℕ* (recall that ℕ* = ℕ \ {0} is
the set of nonzero natural numbers).


Example 11.4: 


1. (ℚ, ≤), (P(ℕ), ⊆), and for each n ∈ ℕ*, (ℤ, ≡ₙ) all have the same type because they each have
exactly one binary relation.


2. (ℤ, +) and (ℤ, +, 0) have different types. The first structure has one binary operation and
nothing else. The second structure has a binary operation and a constant (or a 0-ary operation).
Both of these are different ways of describing the group of integers under addition. The second
way is specifically mentioning the identity element, while the first is not. Another structure (of
yet another type) that describes the same group is (ℤ, +, −, 0), where − is the unary additive
inverse operator.


Note: For structures with only finitely many operations and relations, the definition we gave of being
of the same type is adequate. However, for structures with infinitely many operations and/or relations,
we should be a little more careful with what we mean by “the same number.” A better definition in
this case is that for each n ∈ ℕ, the set of n-ary operations in 𝔄 is equinumerous with the set of n-ary
operations in 𝔅, and for each n ∈ ℕ*, the set of n-ary relations in 𝔄 is equinumerous with the set of
n-ary relations in 𝔅. See Lesson 10 for more information on equinumerosity.


𝔄 is a substructure of 𝔅, written 𝔄 ⊆ 𝔅, if


1. 𝔄 and 𝔅 have the same type.
2. A ⊆ B.
3. If f is an n-ary operation, and (a₁, a₂, …, aₙ) ∈ Aⁿ, then f_𝔄(a₁, a₂, …, aₙ) = f_𝔅(a₁, a₂, …, aₙ).
4. If R is an n-ary relation, and (a₁, a₂, …, aₙ) ∈ Aⁿ, then R_𝔄(a₁, a₂, …, aₙ) if and only if
   R_𝔅(a₁, a₂, …, aₙ).


Notes: (1) Part 1 of the definition says that in order for 𝔄 to be a substructure of 𝔅, the two structures
must have the same number of n-ary operations and n-ary relations for each n. For example, (ℕ, +) is
a substructure of (ℤ, +), written (ℕ, +) ⊆ (ℤ, +), but (ℕ, +) is not a substructure of (ℤ, +, 0).


(2) The notation in 3 and 4 might look confusing at first. Let’s clarify with an example of each. Suppose
that f is addition, so that f(a₁, a₂) = a₁ + a₂. Then 3 says that if 𝔄 ⊆ 𝔅 and we choose a₁ and a₂
from A, then we get the same result whether we add a₁ and a₂ in 𝔄 or 𝔅. We might write this as
a₁ +_𝔄 a₂ = a₁ +_𝔅 a₂. Now suppose that R is <, so that R(a₁, a₂) means a₁ < a₂. Then 4 says that if
𝔄 ⊆ 𝔅 and we choose a₁ and a₂ from A, then a₁ <_𝔄 a₂ if and only if a₁ <_𝔅 a₂.
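

Conditions 2 and 3 of the definition can be explored on a small finite example. This is a sketch under assumptions not made in the text: the domain {0, 1, 2, 3, 4, 5} with addition modulo 6 is a hypothetical structure chosen so that everything is finite, and the names is_closed and add_mod_6 are made up. Because the same Python function serves as the operation in both structures, the agreement required by condition 3 holds automatically, and the remaining question is whether the operation stays inside the smaller set at all.

```python
def is_closed(subset, op):
    """Check that op(a, b) stays inside subset for all a, b in subset."""
    return all(op(a, b) in subset for a in subset for b in subset)


def add_mod_6(a, b):
    """Addition modulo 6 on {0, 1, 2, 3, 4, 5}."""
    return (a + b) % 6


evens = {0, 2, 4}
odds = {1, 3, 5}

print(is_closed(evens, add_mod_6))    # True
print(is_closed(odds, add_mod_6))     # False, since 1 + 1 = 2 is not in odds
```

So the evens with this operation form a substructure of the six-element structure, while the odds with this operation do not even form a structure.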


Example 11.5: 


1. 


Let (S,*) be a semigroup. A substructure (T,*) of (S,x) is called a subsemigroup. Notice that 
T © S and the operation * must be the same for both structures. Also, x is a binary operation 
on T, which means that T is closed under x. Is x associative in T? Recall from Note 2 following 
Example 3.3 in Lesson 3 that associativity is closed downwards. In other words, since * is 
associative in S and T CS, it follows that * is associative in T. We just showed that a 
subsemigroup of a semigroup is itself a semigroup. 


For example, let X = (N, +) and let B = (E, +), where E = {2k | k € N} is the set of even 
natural numbers. Then B © A. That is, B is a subsemigroup of 2. 


On the other hand, if we let O = {2k + 1|k € N}, then (O, +) is not even a structure because 
+ is not a binary operation on O. For example, 3,5 E€ O, but3 +5 ¢ O. 


Let (M,x,e) be a monoid, where e is the identity of M. A substructure (N,x,e) of (M,x, e) is 
called a submonoid. Notice that the operation * and the identity e must be the same for both 
structures. As we saw in 1 above, N is closed under * and x is associative in N. We just showed 
that a submonoid of a monoid is itself a monoid. 


Note that a substructure (N,*) of a monoid (M,x) is a subsemigroup of (M,*), but may or may 
not be a submonoid of (M,x). For example, let C = N \ {0,1} = {2, 3, 4, ...} be the set of 
natural numbers with 0 and 1 removed. Then (C, -) is a subsemigroup of the monoid (N, >), 
but (C, -) is not a submonoid of (N, -) because C is missing the multiplicative identity 1. 


If (M, ⋆) is a monoid with identity e, we can define a submonoid to be a substructure (N, ⋆) of 
(M, ⋆) such that N contains e. In other words, if we wish to leave the identity out of the 
structure, we need to explicitly mention that the domain of the substructure contains the 
identity in order to guarantee that we get a submonoid. For example, if we let 𝔄 = (ℕ, +) and 
𝔅 = (E, +), we see that 𝔅 is a submonoid of 𝔄 because E ⊆ ℕ is closed under + and 0 ∈ E. 


3. Let (G, ⋆, ⁻¹, e) be a group, where ⁻¹ is the unary inverse operator and e is the identity of G. A 
substructure (H, ⋆, ⁻¹, e) of (G, ⋆, ⁻¹, e) is called a subgroup. Notice that the operations ⋆ and 
⁻¹ and the identity e must be the same for both structures. As we saw in 1 and 2 above, H is 
closed under ⋆ and ⋆ is associative in H. By making the unary inverse operator part of the 
structure, we have guaranteed that the inverse property holds for the substructure. So, a 
subgroup of a group is itself a group. 


Also note that if ⋆ is commutative in G, then ⋆ is commutative in H. Commutativity is closed 
downwards for the same reason that associativity is closed downwards (once again, see Note 2 
following Example 3.3 in Lesson 3). 


For example, let 𝔄 = (ℤ, +, −, 0) and let 𝔅 = (2ℤ, +, −, 0), where 2ℤ = {2k | k ∈ ℤ} is the set 
of even integers. Then 𝔅 is a subgroup of 𝔄. More generally, for any positive integer n, we can 
let nℤ = {nk | k ∈ ℤ}. The structure (nℤ, +, −, 0) is a subgroup of the group (ℤ, +, −, 0). 


Note that a substructure (H, ⋆) of a group (G, ⋆) is a subsemigroup of (G, ⋆), but may or may not 
be a subgroup of (G, ⋆), as we saw in 2 above. Furthermore, a substructure (H, ⋆, e) of a group 
(G, ⋆, e) is a submonoid of (G, ⋆, e) but still may not be a subgroup of (G, ⋆, e). For example, 
(ℕ, +, 0) is a substructure of the group (ℤ, +, 0) that is not a subgroup of (ℤ, +, 0) (it is a 
submonoid though). We need to include the unary inverse operator in the structure to 
guarantee that a substructure of a group will be a subgroup. 


If (G, ⋆) is a group with identity e, we can define a subgroup to be a substructure (H, ⋆) of 
(G, ⋆) such that H contains e and for all x ∈ H, x⁻¹ ∈ H (in other words, we need to insist that 
H is closed under taking inverses). These conditions can be used in place of including symbols 
for inverse and identity in the structure itself. For example, if we let 𝔄 = (ℝ*, ⋅) and 
𝔅 = (ℚ*, ⋅), we see that 𝔅 is a subgroup of 𝔄 because ℚ* ⊆ ℝ*, 1 ∈ ℚ*, and ℚ* is closed 
under taking multiplicative inverses. 


If the operation is understood, we can simplify notation even further. We may write H ≤ G and 
say that H is a subgroup of G. What we mean by this is (H, ⋆, ⁻¹, e) is a substructure of 
(G, ⋆, ⁻¹, e), or equivalently, (H, ⋆) is a substructure of (G, ⋆) such that the identity of G is in H 
and H is closed under taking inverses. 


We use the same notation for other structures as well. Just be careful about one thing. When 
we write 𝔄 ≤ 𝔅, we don’t just mean that the structure 𝔄 is a substructure of the structure 𝔅. 
We also mean that the structure has all the properties we need for the type of structure 
under discussion. For example, if we are talking about groups under addition, then we would 
not write ℕ ≤ ℤ. However, if we are talking about monoids under addition, then we could write 
ℕ ≤ ℤ. 


4. Let (R, +, ⋅, −, 1) be a ring, where − is the unary additive inverse operator and 1 is the 
multiplicative identity of R. A substructure (S, +, ⋅, −, 1) of (R, +, ⋅, −, 1) is called a subring. 
Notice that the operations +, ⋅, and −, and the multiplicative identity 1 must be the same for 
both structures. By the definition of a structure, S is closed under +, ⋅, and −. 

You may be wondering why we didn’t put a constant for 0 in the structure. The reason is 
that we don’t need to. Since 1 ∈ S and S is closed under the additive inverse, we have 
0 = 1 + (−1) ∈ S. Associativity of addition and multiplication, commutativity of addition, and 
distributivity all hold in S because these properties are closed downwards (see Note 2 following 
Example 3.3 in Lesson 3). It follows that a subring is itself a ring. 


Alternatively, we can say that (S, +, ⋅) is a subring of (R, +, ⋅) if (S, +, ⋅) is a substructure of 
(R, +, ⋅) such that S contains 1 and for all x ∈ S, −x ∈ S (in other words, we need to insist that 
S is closed under taking additive inverses). 


As we discussed above, we may write S ≤ R for S is a subring of R if it is clear that we are talking 
about the ring structures of S and R. 


For example, (ℤ, +, ⋅) is a subring of the fields (ℚ, +, ⋅), (ℝ, +, ⋅), and (ℂ, +, ⋅). 


(ℤ, +, ⋅) has no subring other than itself. To see this, let A ≤ ℤ. First note that the multiplicative 
identity 1 ∈ A. Using closure of addition and the principle of mathematical induction, we can 
then show that each positive integer is in A (for example, 2 = 1 + 1). Since A is closed under 
the additive inverse of ℤ, for each positive integer n, −n ∈ A. It follows that A = ℤ. (Note that 
we know that 0 ∈ A because we have already shown that 0 is in any subring of a ring.) 


5. Let (F, +, ⋅, −, ⁻¹, 0, 1) be a field, where − and ⁻¹ are the unary additive inverse and 
multiplicative inverse operators, respectively, and 0 and 1 are the additive and multiplicative 
identities of F, respectively. Note that technically speaking, ⁻¹ must be expressed as the binary 
relation ⁻¹ = {(x, y) | y = x⁻¹} because ⁻¹ isn’t defined for x = 0. A substructure 
(K, +, ⋅, −, ⁻¹, 0, 1) of (F, +, ⋅, −, ⁻¹, 0, 1) is a subfield provided that the domain and range of 
the multiplicative inverse relation ⁻¹ are both K*. Notice that the operations +, ⋅, −, the 
relation ⁻¹, and the identities 0 and 1 must be the same for both structures. By the definition 
of a structure, K is closed under +, ⋅, and −. Associativity and commutativity of addition and 
multiplication, and distributivity all hold in K because these properties are closed downwards 
(see Note 2 following Example 3.3 in Lesson 3). It follows that a subfield is itself a field. 


Alternatively, we can say that (K, +, ⋅) is a subfield of (F, +, ⋅) if (K, +, ⋅) is a substructure of 
(F, +, ⋅) such that K contains 0 and 1, for all x ∈ K, −x ∈ K, and for all nonzero x ∈ K, 
x⁻¹ ∈ K (in other words, we need to insist that K is closed under taking additive inverses and 
K* is closed under taking multiplicative inverses). We will write K ≤ F when K is a subfield of 
F and it is clear we are talking about the field structures of K and F. 


For example, (ℚ, +, ⋅) is a subfield of both (ℝ, +, ⋅) and (ℂ, +, ⋅), and (ℝ, +, ⋅) is a subfield 
of (ℂ, +, ⋅). 


6. If (P, ≤) is a partially ordered set, then a substructure (Q, ≤) of (P, ≤) will also be a partially 
ordered set. This is because reflexivity, antisymmetry, and transitivity are all closed downwards. 
Once again, see Note 2 following Example 3.3 in Lesson 3 for an explanation of this. Similarly, 
any substructure of a linearly ordered set is linearly ordered, and similar results hold for strict 
partial and linear orders. 


For example, we have (ℕ, ≤) ⊆ (ℤ, ≤) ⊆ (ℚ, ≤) ⊆ (ℝ, ≤), and each of these structures is a 
linearly ordered set. Similarly, we have (ℕ, <) ⊆ (ℤ, <) ⊆ (ℚ, <) ⊆ (ℝ, <). 
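
The closure conditions that appear throughout Example 11.5 can be spot-checked by machine. The following Python sketch is an illustration only and is not part of the original text: it tests closure on a finite sample of natural numbers, so it can refute closure but can never prove it. The helper name `is_closed_on_sample` and the sample size are assumptions made for this sketch.

```python
from itertools import product
from operator import add, mul

def is_closed_on_sample(elements, op, member):
    """Return True if op(a, b) passes `member` for every pair in the finite sample."""
    return all(member(op(a, b)) for a, b in product(elements, repeat=2))

N = range(0, 50)                       # finite sample of the natural numbers
is_even = lambda n: n % 2 == 0         # membership test for E = {2k | k in N}
is_odd = lambda n: n % 2 == 1          # membership test for O = {2k + 1 | k in N}
in_C = lambda n: n >= 2                # membership test for C = N \ {0, 1}

evens = [n for n in N if is_even(n)]
odds = [n for n in N if is_odd(n)]
C = [n for n in N if in_C(n)]

evens_closed = is_closed_on_sample(evens, add, is_even)  # no counterexample in the sample
odds_closed = is_closed_on_sample(odds, add, is_odd)     # fails, e.g. 3 + 5 = 8 is even
C_closed = is_closed_on_sample(C, mul, in_C)             # no counterexample in the sample
C_has_identity = in_C(1)                                 # 1 is not in C, so no submonoid
```

A result of `True` here means only that no counterexample was found in the sample. The general statements still require the proofs given above.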
Homomorphisms 


A homomorphism is a function from one structure to another structure of the same type that preserves 
all the relations and functions of the structure (see the Note after Example 11.6 for a more rigorous 
definition). 


Example 11.6: 


1. Let (S, ⋆) and (T, ∘) be semigroups. A semigroup homomorphism is a function f: S → T such 
that for all a, b ∈ S, f(a ⋆ b) = f(a) ∘ f(b). 


For example, let 𝔄 = (ℤ⁺, +), 𝔅 = (E, ⋅), and let f: ℤ⁺ → E be defined by f(n) = 2ⁿ. For all 
n, m ∈ ℤ⁺, we have f(n + m) = 2ⁿ⁺ᵐ = 2ⁿ ⋅ 2ᵐ = f(n) ⋅ f(m). Therefore, f is a semigroup 
homomorphism. 


As another example, let 𝔄 = (ℕ, +), 𝔅 = ({T, F}, ∨), and let g: ℕ → {T, F} be defined by 
g(n) = T. For all n, m ∈ ℕ, we have g(n + m) = T = T ∨ T = g(n) ∨ g(m). Therefore, g is a 
semigroup homomorphism. 


2. Let (M, ⋆, e_M) and (N, ∘, e_N) be monoids, where e_M and e_N are the identities of M and N, 
respectively. A monoid homomorphism is a function f: M → N such that for all a, b ∈ M, 
f(a ⋆ b) = f(a) ∘ f(b) and f(e_M) = e_N. 

Note that we need to include the identity element of a monoid as part of the structure for a 
homomorphism to be a monoid homomorphism. Otherwise we get only a semigroup 
homomorphism. The second example in part 1 above is a semigroup homomorphism, but not 
a monoid homomorphism. Indeed, the identity of (ℕ, +) is 0 and the identity of ({T, F}, ∨) is 
F, but g(0) = T ≠ F. 


On the other hand, if we change the domains of the structures in the first example from part 1 
above slightly, we do get a monoid homomorphism. Let 𝔄 = (ℕ, +, 0), 𝔅 = (ℕ, ⋅, 1), and let 
f: ℕ → ℕ be defined by f(n) = 2ⁿ. For all n, m ∈ ℕ, f(n + m) = f(n) ⋅ f(m), as we saw 
above, and f(0) = 2⁰ = 1. Therefore, f is a monoid homomorphism. 


3. Let (G, ⋆) and (H, ∘) be groups. A group homomorphism is a function f: G → H such that for all 
a, b ∈ G, f(a ⋆ b) = f(a) ∘ f(b). 

You may be asking why we are not including constant symbols for the identity like we did for 
monoids. After all, we certainly want f to take the identity of G to the identity of H. And you 
may also be asking why we are not including a unary operator symbol for taking the inverse, as 
we certainly want f(a⁻¹) = (f(a))⁻¹. For structures (G, ⋆, ⁻¹_G, e_G) and (H, ∘, ⁻¹_H, e_H), we 
can define a group homomorphism to be a function f: G → H such that for all a, b ∈ G, 
f(a ⋆ b) = f(a) ∘ f(b), for all a ∈ G, f(a⁻¹) = (f(a))⁻¹, and f(e_G) = e_H. However, it turns 
out that this more complicated definition is equivalent to our first simpler one. In other words, 
if f: G → H is a group homomorphism using the simpler definition, then f already maps the 
identity of G to the identity of H, and f already preserves inverses. We will prove these facts in 
Theorems 11.1 and 11.2 below. 

As an example, let 𝔄 = (ℤ, +), 𝔅 = ({1, −1}, ⋅), and let f: ℤ → {1, −1} be defined by 

f(n) = 1 if n is even, and f(n) = −1 if n is odd. 

There are four cases to consider. If n and m are both even, then 
n + m is even, and so, f(n + m) = 1 and f(n) ⋅ f(m) = 1 ⋅ 1 = 1. If n and m are both odd, 
then n + m is even, and so, f(n + m) = 1 and f(n) ⋅ f(m) = (−1) ⋅ (−1) = 1. If n is even and 
m is odd, then n + m is odd, and so, f(n + m) = −1 and f(n) ⋅ f(m) = 1 ⋅ (−1) = −1. Finally, 
if n is odd and m is even, then n + m is odd, and so, we have f(n + m) = −1 and 
f(n) ⋅ f(m) = (−1) ⋅ 1 = −1. Therefore, f is a group homomorphism. 


Let’s look at another example. Let 𝔄 = (ℝ, +), 𝔅 = (ℝ, +), and let g: ℝ → ℝ be defined by 
g(x) = x². Then g is not a group homomorphism. To see this, we just need a single 
counterexample. We have g(1) = 1² = 1, g(2) = 2² = 4, g(1 + 2) = g(3) = 3² = 9, and 
g(1) + g(2) = 1 + 4 = 5. Since g(1 + 2) ≠ g(1) + g(2), g fails to be a homomorphism. 


4. Let (R, +_R, ⋅_R, 1_R) and (S, +_S, ⋅_S, 1_S) be rings, where 1_R and 1_S are the multiplicative identities 
of R and S, respectively. A ring homomorphism is a function f: R → S such that for all a, b ∈ R, 
f(a +_R b) = f(a) +_S f(b), f(a ⋅_R b) = f(a) ⋅_S f(b), and f(1_R) = 1_S. 

Notice that we did not include constant symbols for the additive identities of the rings and we 
did not include unary operator symbols for taking the additive inverses of elements in the rings. 
We will see in Theorems 11.1 and 11.2 below that with f defined as above, it follows that for 
all a ∈ R, f(−a) = −f(a), and f(0_R) = 0_S. 


Let’s look at an example. First note that if R is a ring, then R × R with addition and multiplication 
defined componentwise is also a ring. That is, for a, b, c, d ∈ R, we define addition and 
multiplication by (a, b) + (c, d) = (a + c, b + d) and (a, b)(c, d) = (ac, bd). The verification 
that R × R is a ring with these definitions is straightforward (see Problem 5 below). Let 
𝔄 = (ℤ × ℤ, +, ⋅, (1, 1)), 𝔅 = (ℤ, +, ⋅, 1), and let f: ℤ × ℤ → ℤ be defined by f((n, m)) = n. 
Then for all n, m, j, k ∈ ℤ, we have f((n, m) + (j, k)) = f((n + j, m + k)) = n + j and 
f((n, m)) + f((j, k)) = n + j. We also have f((n, m) ⋅ (j, k)) = f((nj, mk)) = nj and 
f((n, m)) ⋅ f((j, k)) = nj. Finally, f((1, 1)) = 1. Therefore, f is a ring homomorphism. 


Let’s look at another example. Let 𝔄 = 𝔅 = (ℤ, +, ⋅, 1), and let g: ℤ → ℤ be defined by 
g(n) = 2n. Then g is not a ring homomorphism. To see this, we just need a single 
counterexample. g(3) = 2 ⋅ 3 = 6, g(5) = 2 ⋅ 5 = 10, g(3 ⋅ 5) = g(15) = 2 ⋅ 15 = 30, and 
g(3) ⋅ g(5) = 6 ⋅ 10 = 60. Since g(3 ⋅ 5) ≠ g(3) ⋅ g(5), g fails to be a ring homomorphism. 
Note, however, that g is a group homomorphism from (ℤ, +) to itself. Indeed, if n, m ∈ ℤ, then 
g(n + m) = 2(n + m) = 2n + 2m = g(n) + g(m). 

5. A field homomorphism is the same as a ring homomorphism. The multiplicative inverse is 
automatically preserved (see Theorem 11.2 below), and so, nothing additional needs to be 
added to the definition. 


6. Let (A, ≤_A) and (B, ≤_B) be partially ordered sets. An order homomorphism (also known as a 
monotonic function) is a function f: A → B such that for all x, y ∈ A, x ≤_A y if and only if 
f(x) ≤_B f(y). 

For example, let 𝔄 = 𝔅 = (ℕ, ≤) and let f: ℕ → ℕ be defined by f(n) = n + 3. For all 
n, m ∈ ℕ, we have n ≤ m if and only if n + 3 ≤ m + 3 if and only if f(n) ≤ f(m). Therefore, 
f is an order homomorphism. 

As another example, let 𝔄 = (ℤ, ≥), 𝔅 = (𝒫(ℤ), ⊆), and let g: ℤ → 𝒫(ℤ) be defined by 
g(n) = {k ∈ ℤ | n ≤ k}. Let m, n ∈ ℤ. We will show that m ≥ n if and only if the relationship 
{k ∈ ℤ | m ≤ k} ⊆ {k ∈ ℤ | n ≤ k} holds. Suppose that m ≥ n and let j ∈ {k ∈ ℤ | m ≤ k}. 
Then j ≥ m. Since m ≥ n, j ≥ n, and so, j ∈ {k ∈ ℤ | n ≤ k}. Now, let 
{k ∈ ℤ | m ≤ k} ⊆ {k ∈ ℤ | n ≤ k}. Since m ≤ m, we have m ∈ {k ∈ ℤ | m ≤ k}. So, 
m ∈ {k ∈ ℤ | n ≤ k}. Thus, n ≤ m, or equivalently, m ≥ n. Therefore, g is an order 
homomorphism. 
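
The homomorphism conditions in Example 11.6 can be tested numerically on finite samples. The Python sketch below is an illustration under stated assumptions, not a proof and not part of the original text: the helper `preserves` is a hypothetical name, the sample ranges are arbitrary, and the squaring map is tested on integers only (which is enough to exhibit the counterexample from part 3).

```python
from itertools import product

def preserves(f, op_src, op_dst, sample):
    """Check f(a op_src b) == f(a) op_dst f(b) for every pair drawn from the sample."""
    return all(f(op_src(a, b)) == op_dst(f(a), f(b))
               for a, b in product(sample, repeat=2))

add = lambda a, b: a + b
mul = lambda a, b: a * b

naturals = range(0, 12)      # finite sample of N
integers = range(-10, 11)    # finite sample of Z

power_of_two = lambda n: 2 ** n               # f(n) = 2^n from (N, +, 0) to (N, *, 1)
parity = lambda n: 1 if n % 2 == 0 else -1    # f from (Z, +) to ({1, -1}, *)
square = lambda x: x * x                      # g(x) = x^2 on an additive group
double = lambda n: 2 * n                      # g(n) = 2n on Z

monoid_hom = preserves(power_of_two, add, mul, naturals) and power_of_two(0) == 1
parity_hom = preserves(parity, add, mul, integers)
square_hom = preserves(square, add, add, integers)             # fails: g(1 + 2) = 9, g(1) + g(2) = 5
double_additive = preserves(double, add, add, integers)
double_multiplicative = preserves(double, mul, mul, integers)  # fails: g(15) = 30, g(3) * g(5) = 60
```

A single failing pair is enough to show that a map is not a homomorphism, which is why the two counterexamples in the text settle the matter, whereas a passing check on a sample only fails to find a counterexample.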


Note: Here is a more rigorous definition of a homomorphism. 


If 𝔄 and 𝔅 are structures of the same type with underlying domains A and B, then a homomorphism 
is a function f: A → B such that for each n ∈ ℕ, 


1. If R is an n-ary relation, then R_𝔄(a₁, a₂, …, aₙ) if and only if R_𝔅(f(a₁), f(a₂), …, f(aₙ)). 
2. If F is an n-ary function, then f(F_𝔄(a₁, a₂, …, aₙ)) = F_𝔅(f(a₁), f(a₂), …, f(aₙ)). 


In particular, 2 implies that if c is a constant, then f(c_𝔄) = c_𝔅. 


Theorem 11.1: Let (G, ⋆) and (H, ∘) be groups with identities e_G and e_H, respectively, and let 
f: G → H be a group homomorphism. Then f(e_G) = e_H. 


Proof: Since e_G = e_G ⋆ e_G, we have f(e_G) = f(e_G ⋆ e_G) = f(e_G) ∘ f(e_G). So, 
f(e_G) = f(e_G) ∘ e_H = f(e_G) ∘ (f(e_G) ∘ (f(e_G))⁻¹) 
= (f(e_G) ∘ f(e_G)) ∘ (f(e_G))⁻¹ = f(e_G) ∘ (f(e_G))⁻¹ = e_H. □ 


Notes: (1) The computations in the proof take place in the group (H, ∘). In particular, f(e_G) ∈ H and 
e_H ∈ H. If the proof seems confusing because f(e_G) appears so often, try making the substitutions 
h = f(e_G) and e = e_H. Notice that h, e ∈ H and by the first line of the proof, h = h ∘ h. The rest of the 
proof then looks like this: 


h = h ∘ e = h ∘ (h ∘ h⁻¹) = (h ∘ h) ∘ h⁻¹ = h ∘ h⁻¹ = e. 


Remember that h = f(e_G) and e = e_H. So, we have f(e_G) = e_H, as desired. 

(2) h = h ∘ e because e is the identity for H. 


(3) e = h ∘ h⁻¹ by the definition of inverse and because e is the identity for H. From this equation, it 
follows that h ∘ e = h ∘ (h ∘ h⁻¹). 


(4) h ∘ (h ∘ h⁻¹) = (h ∘ h) ∘ h⁻¹ because ∘ is associative in H. 


(5) h ∘ h = h from the first line of the proof (this is equivalent to f(e_G) ∘ f(e_G) = f(e_G)). It follows 
that (h ∘ h) ∘ h⁻¹ = h ∘ h⁻¹. 


(6) Finally, h ∘ h⁻¹ = e, again by the definition of inverse and because e is the identity for H. 


(7) If the group operation is addition, then we usually use the symbols 0_G and 0_H for the identities. 


Theorem 11.2: Let (G, ⋆) and (H, ∘) be groups and let f: G → H be a group homomorphism. Then for 
all g ∈ G, f(g⁻¹) = (f(g))⁻¹. 

Proof: By Theorem 11.1, we have f(e_G) = e_H. So, for g ∈ G, we have 
e_H = f(e_G) = f(g ⋆ g⁻¹) = f(g) ∘ f(g⁻¹). 
Since f(g) ∘ f(g⁻¹) = e_H, f(g⁻¹) = (f(g))⁻¹. □ 


Notes: (1) e_G = g ⋆ g⁻¹ by the definition of inverse and because e_G is the identity for G. From this 
equation, it follows that f(e_G) = f(g ⋆ g⁻¹). 


(2) f(g ⋆ g⁻¹) = f(g) ∘ f(g⁻¹) because f is a homomorphism. 


(3) In a group with identity e, if xy = e and yx = e, then y = x⁻¹. We actually need to verify only one 
of the equations xy = e or yx = e to determine that y = x⁻¹ (see Note 6 after the solution to Problem 
7 in Problem Set 3 from Lesson 3). Letting x = f(g), y = f(g⁻¹), and e = e_H, we showed in the proof 
that xy = e. It follows that y = x⁻¹. That is, f(g⁻¹) = (f(g))⁻¹. 
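
To see Theorems 11.1 and 11.2 in action on a concrete map, the following Python sketch checks them for f(n) = 2ⁿ viewed as a function from (ℤ, +) to (ℚ*, ⋅). This particular pair of groups is an assumption chosen for illustration (the text uses 2ⁿ only on ℕ and ℤ⁺), and the check runs over a finite sample, so it illustrates the theorems rather than proving them.

```python
from fractions import Fraction

def f(n):
    """f(n) = 2^n as an exact rational number, mapping (Z, +) into (Q*, *)."""
    return Fraction(2) ** n

sample = range(-8, 9)   # finite sample of Z

# Homomorphism property: f(a + b) = f(a) * f(b).
is_hom = all(f(a + b) == f(a) * f(b) for a in sample for b in sample)

# Theorem 11.1: the additive identity 0 maps to the multiplicative identity 1.
identity_preserved = f(0) == 1

# Theorem 11.2: f(-n) is the multiplicative inverse of f(n).
inverses_preserved = all(f(-n) == 1 / f(n) for n in sample)
```

Exact rational arithmetic is used so that the equalities are tested without floating point rounding.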


An isomorphism is a bijective homomorphism. If there is an isomorphism from a structure 𝔄 to a 
structure 𝔅, then we say that 𝔄 and 𝔅 are isomorphic, and we write 𝔄 ≅ 𝔅. Mathematicians generally 
consider isomorphic structures to be the same. Indeed, they behave identically. The only difference 
between them is the “names” of the elements. 


Example 11.7: 


1. For n ∈ ℤ⁺, the function f: ℤ → nℤ defined by f(k) = nk is an isomorphism between the 
groups (ℤ, +) and (nℤ, +). It’s easy to see that f is injective (j ≠ k → nj ≠ nk) and surjective 
(if nk ∈ nℤ, then f(k) = nk). If j, k ∈ ℤ, then f(j + k) = n(j + k) = nj + nk = f(j) + f(k). 
It follows that (ℤ, +) ≅ (nℤ, +). 


Note that this map is not a ring isomorphism for n > 1. First, (nℤ, +, ⋅) is technically not even 
a ring for n > 1 because 1 ∉ nℤ. But it is “almost a ring.” In fact, the multiplicative identity 
property is the only property that fails. See the notes following Theorem 11.4 for more details. 


Let’s show that for n > 1, f is not an isomorphism between the “almost rings” (ℤ, +, ⋅) and 
(nℤ, +, ⋅). Let’s use 2, 3 ∈ ℤ to provide a counterexample: f(2 ⋅ 3) = f(6) = n ⋅ 6 = 6n and 
f(2) ⋅ f(3) = (n ⋅ 2)(n ⋅ 3) = 6n². If f(2 ⋅ 3) = f(2) ⋅ f(3), then 6n = 6n², so that n = n². 
This equation is equivalent to n² − n = 0, or n(n − 1) = 0. So n = 0 or n = 1, and neither is 
possible when n > 1. 


In fact, as “almost rings,” (ℤ, +, ⋅) is not isomorphic to (nℤ, +, ⋅) at all for n > 1. If f: ℤ → nℤ 
were an isomorphism, then f(1) = nm for some m ∈ ℤ. But also, since f is a homomorphism, 
f(1) = f(1 ⋅ 1) = f(1)f(1) = (nm)(nm) = n²m². So, nm = n²m², and thus, m = 0, n = 0, 
or 1 = nm. If m = 0, then f(1) = 0, and so, f(2) = f(1 + 1) = f(1) + f(1) = 0 + 0 = 0. 
So, f is not injective. Since n > 1, n ≠ 0 and 1 ≠ nm. So, every case leads to a contradiction. 


2. Recall that if z = a + bi is a complex number, then the conjugate of z is the complex number 
z̄ = a − bi. The function f: ℂ → ℂ defined by f(z) = z̄ is an isomorphism between the field 
(ℂ, +, ⋅) and itself. By Problem 3 (parts (iii) and (iv)) from Problem Set 7 in Lesson 7, the 
conjugate of a sum is the sum of the conjugates and the conjugate of a product is the product 
of the conjugates. So, we have 

f(z + w) = z̄ + w̄ = f(z) + f(w),     f(zw) = z̄ ⋅ w̄ = f(z)f(w),     f(1) = 1̄ = 1. 


Thus, f is a homomorphism. Since for all z ∈ ℂ, f(z̄) = z, f is surjective. Since z ≠ w implies 
that z̄ ≠ w̄, f is injective. Therefore, f is a bijective homomorphism, and so, f is an 
isomorphism. 


An isomorphism from a structure to itself is called an automorphism. The identity function is 
always an automorphism from any structure to itself. In the previous example, we described a 
nontrivial automorphism from ℂ to ℂ. 
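
Both parts of Example 11.7 lend themselves to a quick numerical check. The Python sketch below is illustrative only and is not part of the original text: it fixes n = 3 as an arbitrary choice, works on finite samples, and restricts the conjugation test to complex numbers with small integer parts so that floating point arithmetic stays exact.

```python
from itertools import product

# Part 1: f(k) = nk from Z to nZ, with n = 3 chosen arbitrarily for this sketch.
n = 3
f = lambda k: n * k
ints = range(-6, 7)

f_additive = all(f(j + k) == f(j) + f(k) for j, k in product(ints, repeat=2))
f_injective = len({f(k) for k in ints}) == len(ints)
f_multiplicative = all(f(j * k) == f(j) * f(k)
                       for j, k in product(ints, repeat=2))   # fails: f(6) = 18, f(2) * f(3) = 54

# Part 2: complex conjugation, tested on a small grid of Gaussian integers.
conj = lambda z: z.conjugate()
zs = [complex(a, b) for a, b in product(range(-3, 4), repeat=2)]

conj_additive = all(conj(z + w) == conj(z) + conj(w) for z, w in product(zs, repeat=2))
conj_multiplicative = all(conj(z * w) == conj(z) * conj(w) for z, w in product(zs, repeat=2))
conj_fixes_one = conj(complex(1, 0)) == 1
conj_is_own_inverse = all(conj(conj(z)) == z for z in zs)   # so conjugation is a bijection
```

The last check reflects the argument in the text: since conjugating twice returns the original number, conjugation is both injective and surjective.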


Images and Kernels 


Let f: A → B be a homomorphism. The image of f is the set f[A] = {f(x) | x ∈ A} and the kernel of 
f is the set ker(f) = {x ∈ A | f(x) = e_B}. In the case where B has both an additive and a multiplicative 
identity, e_B will always be the additive identity (in other words, if 0, 1 ∈ B, then the kernel of f is 
the set of all elements of A that map to 0). 
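
As a minimal sketch of these two definitions, the following Python code computes the image and kernel of two homomorphisms from Example 11.6, restricted to finite windows of their infinite domains. The function names `image` and `kernel` and the window sizes are assumptions made for illustration; the true image and kernel are the infinite sets that these windows sample.

```python
def image(f, domain):
    """The set of values f(x) for x in the given (finite) part of the domain."""
    return {f(x) for x in domain}

def kernel(f, domain, identity):
    """The elements of the given (finite) part of the domain that f sends to the identity."""
    return {x for x in domain if f(x) == identity}

# The parity map from (Z, +) to ({1, -1}, *); the identity of the codomain is 1.
parity = lambda n: 1 if n % 2 == 0 else -1
window = range(-6, 7)
parity_image = image(parity, window)        # both 1 and -1 are hit
parity_kernel = kernel(parity, window, 1)   # the even integers in the window

# The projection f((n, m)) = n from Z x Z to Z; the kernel uses the additive identity 0.
proj = lambda p: p[0]
pairs = [(a, b) for a in range(-2, 3) for b in range(-2, 3)]
proj_kernel = kernel(proj, pairs, 0)        # the pairs of the form (0, m)
```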


Theorem 11.3: Let f: R → S be a ring homomorphism. Then f[R] is a subring of S. 


Proof: Since f(x) + f(y) = f(x + y) and f(x)f(y) = f(xy), we see that f[R] is closed under 
addition and multiplication. Since 1_S = f(1_R), 1_S ∈ f[R]. By Theorem 11.2, −f(x) = f(−x) (this is 
the conclusion of Theorem 11.2 when additive notation is used). So, for each element f(x) ∈ f[R], 
−f(x) ∈ f[R]. It follows that f[R] is a subring of S. □ 


Note: The same result holds if we replace “ring” by semigroup, monoid, group, or field. If (S, ⋆) and 
(T, ∘) are semigroups, and f: S → T is a semigroup homomorphism, then f(x) ∘ f(y) = f(x ⋆ y) 
shows that f[S] is closed under ∘, and therefore, f[S] is a subsemigroup of T. 


Furthermore, if (M, ⋆) and (N, ∘) are monoids, and f: M → N is a monoid homomorphism, then by 
definition, f(e_M) = e_N, and therefore, f[M] is a submonoid of N. 


If (G, ⋆) and (H, ∘) are groups, and f: G → H is a group homomorphism, then f(e_G) = e_H by Theorem 
11.1, and for all g ∈ G, (f(g))⁻¹ = f(g⁻¹) by Theorem 11.2. Therefore, f[G] is a subgroup of H. 


If (F, +, ⋅) and (K, +, ⋅) are fields, and f: F → K is a field homomorphism, then for all x ∈ F*, 
(f(x))⁻¹ = f(x⁻¹) by Theorem 11.2 again. Therefore, f[F] is a subfield of K. 


Theorem 11.4: Let f: G → H be a group homomorphism. Then ker(f) is a subgroup of G. 


Proof: Let x, y ∈ ker(f). Then f(x) = e_H and f(y) = e_H. So f(x ⋆ y) = f(x) ∘ f(y) = e_H ∘ e_H = e_H. 
Thus, x ⋆ y ∈ ker(f). Since f(e_G) = e_H (by Theorem 11.1), e_G ∈ ker(f). Suppose x ∈ ker(f). By 
Theorem 11.2, we have f(x⁻¹) = (f(x))⁻¹ = (e_H)⁻¹ = e_H. So x⁻¹ ∈ ker(f). Therefore, ker(f) is a 
subgroup of G. □ 


Notes: (1) The same result holds for semigroups and monoids. This should be clear from the proof. 

(2) Let’s say that (R, +, ⋅) is almost a ring if all the ring properties hold except the existence of a 
multiplicative identity. Similarly, we will say that (S, +, ⋅) is almost a subring of the ring (R, +, ⋅) if all 
the properties of being a subring hold except S does not contain the multiplicative identity. 


In this case, some authors use the word “rng.” They intentionally leave out the “i” in “ring” to help 
remember that this structure has no multiplicative identity. In other words, 1 is missing from a rng. 


(3) If f: R → S is a ring homomorphism, then unless S is the trivial ring {0}, ker(f) is not a ring because 
f(1_R) = 1_S ≠ 0_S. So, 1_R ∉ ker(f). However, every other property holds and so ker(f) is almost a 
subring of R. Indeed, if x, y ∈ ker(f), then 


f(x + y) = f(x) + f(y) = 0_S + 0_S = 0_S and f(xy) = f(x)f(y) = 0_S ⋅ 0_S = 0_S. 


Also, f(0_R) = 0_S by Theorem 11.1, and if x ∈ ker(f), then f(−x) = −f(x) = −0_S = 0_S by Theorem 
11.2 (this is the conclusion of Theorem 11.2 when additive notation is used). 


(4) Some authors exclude the existence of a multiplicative identity from the definition of a ring. Note 3 
gives a good justification for doing so. However, removing a property creates other complexities. So, 
there is no right or wrong answer here. For us, rings will always include a multiplicative identity. If we 
wish to exclude the multiplicative identity, we will call the structure “almost a ring.” 


Theorem 11.5: Let f: G → H be a group homomorphism. Then ker(f) = {e_G} if and only if f is injective. 


Proof: Suppose that ker(f) = {e_G}, let x, y ∈ G, and let f(x) = f(y). Then f(x)(f(y))⁻¹ = e_H. It 
follows from Theorem 11.2 that f(xy⁻¹) = f(x)f(y⁻¹) = f(x)(f(y))⁻¹ = e_H. So, xy⁻¹ ∈ ker(f). 
Since ker(f) = {e_G}, xy⁻¹ = e_G. Therefore, x = y. Since x, y ∈ G were arbitrary, f is injective. 


Conversely, suppose that f is injective, and let x ∈ ker(f). Then f(x) = e_H. But also, by Theorem 11.1, 
f(e_G) = e_H. So, f(x) = f(e_G). Since f is injective, x = e_G. Since x ∈ ker(f) was arbitrary, ker(f) ⊆ {e_G}. 
By Theorem 11.1, f(e_G) = e_H, so that e_G ∈ ker(f), and therefore, {e_G} ⊆ ker(f). It follows that 
ker(f) = {e_G}. □ 


Note: The theorem also holds for ring homomorphisms. Specifically, if f: R → S is a ring 
homomorphism, then ker(f) = {0_R} if and only if f is injective. The proof is the same, except additive 
notation should be used. Here is a sketch of the proof using additive notation: 


If ker(f) = {0_R} and f(x) = f(y), then f(x + (−y)) = f(x) + f(−y) = f(x) − f(y) = 0_S, so that 
x + (−y) ∈ ker(f), and thus, x + (−y) = 0_R, and so, x = y. 


Conversely, if f is injective and x ∈ ker(f), then f(x) = 0_S. Since f(0_R) = 0_S and f is injective, we 
have x = 0_R. So, ker(f) ⊆ {0_R}. Also, f(0_R) = 0_S. So, 0_R ∈ ker(f), and therefore, {0_R} ⊆ ker(f). 
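
Theorem 11.5 can be observed directly on a finite group, where both sides of the equivalence are computable. The sketch below uses an example that does not appear in the text and is an assumption made for illustration: the group of integers {0, 1, …, n − 1} under addition modulo n, together with the maps x ↦ kx mod n, each of which is a group homomorphism from this group to itself. The helper names are hypothetical.

```python
def mult_map(k, n):
    """The map x -> k*x mod n on {0, ..., n-1} under addition mod n (a group homomorphism)."""
    return lambda x: (k * x) % n

def kernel_is_trivial(f, n):
    """True when 0 is the only element sent to the identity 0."""
    return {x for x in range(n) if f(x) == 0} == {0}

def is_injective(f, n):
    """True when the n inputs produce n distinct outputs."""
    return len({f(x) for x in range(n)}) == n

n = 12
# For every multiplier k, the two conditions of Theorem 11.5 agree.
agreement = all(kernel_is_trivial(mult_map(k, n), n) == is_injective(mult_map(k, n), n)
                for k in range(n))

five_is_injective = is_injective(mult_map(5, n), n)   # kernel {0}
four_is_injective = is_injective(mult_map(4, n), n)   # kernel {0, 3, 6, 9}
```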


Normal Subgroups and Ring Ideals 


Let (G, ⋆) be a group and h, k ∈ G. We say that k is a conjugate of h if there is a g ∈ G such that 
k = ghg⁻¹ (as usual, we abbreviate g ⋆ h ⋆ g⁻¹ as ghg⁻¹). 

If (G, ⋆) is a group, we say that a subgroup N of G is normal, and write N ⊲ G, if whenever h ∈ N and 
k ∈ G is a conjugate of h, then k ∈ N. (In this case, we may say that N is closed under conjugation.) 


Example 11.8: 

1. If G is a commutative group, then every subgroup H of G is normal. Indeed, if h ∈ H and g ∈ G, 
then ghg⁻¹ = hgg⁻¹ = he = h ∈ H. 

2. If f: G → H is a group homomorphism, then ker(f) is a normal subgroup of G. We already 
showed in Theorem 11.4 that ker(f) is a subgroup of G. To see that ker(f) ⊲ G, let h ∈ ker(f) 
and let g ∈ G. Then f(ghg⁻¹) = f(g)f(h)f(g⁻¹) = f(g) e_H (f(g))⁻¹ = f(g)(f(g))⁻¹ = e_H. 
So, ghg⁻¹ ∈ ker(f). 

3. Any group is a normal subgroup of itself. Indeed, if h ∈ G and g ∈ G, then clearly ghg⁻¹ ∈ G. 

4. The trivial subgroup of a group G consisting of just the identity e is a normal subgroup of G. 
Indeed, if h ∈ {e} and g ∈ G, then ghg⁻¹ = geg⁻¹ = gg⁻¹ = e ∈ {e}. 

5. Let A be a nonempty set. A bijection from A to itself is called a permutation of A. Let S(A) be 
the set of permutations of A. Let’s check that (S(A), ∘) is a group, where ∘ is the operation of 
composition. 


By Corollary 10.5 from Lesson 10, S(A) is closed under ∘. 

To see that ∘ is associative, let f, g, h ∈ S(A) and let a ∈ A. Then 

((f ∘ g) ∘ h)(a) = (f ∘ g)(h(a)) = f(g(h(a))) = f((g ∘ h)(a)) = (f ∘ (g ∘ h))(a). 

Since a ∈ A was arbitrary, (f ∘ g) ∘ h = f ∘ (g ∘ h). So, ∘ is associative in S(A). 
Recall that the identity permutation i_A is defined by i_A(a) = a for all a ∈ A. If a ∈ A, then 
(i_A ∘ f)(a) = i_A(f(a)) = f(a) = f(i_A(a)) = (f ∘ i_A)(a). Since a ∈ A was arbitrary, we have 
i_A ∘ f = f and f ∘ i_A = f. 

Recall that for each f ∈ S(A), there is an inverse permutation f⁻¹ satisfying 
f⁻¹ ∘ f = f ∘ f⁻¹ = i_A (by Theorem 10.6). 

So, we have verified that (S(A),°) is a group. 

If A = {1, 2, …, n}, then we define Sₙ to be S(A). For example, S₃ = S({1, 2, 3}). We can 
visualize each element of S₃ with a cycle diagram. Here are the six elements of S₃ visualized 
this way. 


(1)     (12)     (13)     (23)     (123)     (132) 
[Figure: six cycle diagrams, one for each element of S₃ listed above, each drawn on the points 1, 2, and 3.] 


The first diagram represents the identity permutation {(1, 1), (2, 2), (3, 3)}, where each 
element is being mapped to itself. Technically, we should have an arrow from each point looping 
back to itself. However, to avoid unnecessary clutter, we leave out arrows for elements that are 
mapping to themselves. In cycle notation, we have (1)(2)(3), which we abbreviate as (1). 

The second diagram represents the permutation {(1, 2), (2, 1), (3, 3)}, where 1 is being 
mapped to 2, 2 is being mapped to 1, and 3 is being mapped to itself. Again, we leave out the 
arrow from 3 to itself to avoid clutter, and we just put in the arrows from 1 to 2 and from 2 to 
1. In cycle notation, we have (12)(3), which we abbreviate as (12). In this notation, (12) 
represents a cycle. The cycle moves from left to right and the last element in the cycle connects 
to the first. So, 1 maps to 2 and 2 maps to 1. Any element that does not appear in the cycle 
notation maps to itself. 


As one more example, in the cycle (123), 1 maps to 2, 2 maps to 3, and 3 maps to 1. 


To compose two permutations in cycle notation, we write the one we want to apply first on the 
right (just as we do in function notation). For example, let’s simplify (12)(13). Starting with 1, 
we see that the rightmost cycle sends 1 to 3. The leftmost cycle sends 3 to itself, and so the 
composition sends 1 to 3. Let’s do 2 next. The rightmost cycle sends 2 to itself, and then the 
leftmost cycle sends 2 to 1. So, the composition sends 2 to 1. And finally, let’s look at 3. The 
rightmost cycle sends 3 to 1, and then the leftmost cycle sends 1 to 2. So, the composition 
sends 3 to 2. It follows that (12)(13) = (132). 


Observe that the group (S₃, ∘) is not commutative. For example, (12)(13) = (132), whereas 
(13)(12) = (123). 


Let’s consider the subgroups H = {(1), (123), (132)} and K = {(1), (12)}. One of these is a 
normal subgroup of S₃ and the other is not. You will be asked to verify that H and K are 
subgroups of S₃ and to determine which one is normal and which one is not in Problem 2 below. 
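
The rule for composing permutations in cycle notation described in part 5 can be carried out mechanically. The following Python sketch is one possible encoding, offered as an illustration rather than as the book’s method: a permutation of {1, 2, 3} is stored as a dictionary, and the helper names `from_cycle` and `compose` are hypothetical.

```python
def from_cycle(cycle, n=3):
    """Build a permutation of {1, ..., n} as a dict from a single cycle such as (1, 3, 2)."""
    perm = {i: i for i in range(1, n + 1)}        # elements not in the cycle map to themselves
    for i, x in enumerate(cycle):
        perm[x] = cycle[(i + 1) % len(cycle)]     # the last element connects back to the first
    return perm

def compose(f, g):
    """Return f o g: apply g (written on the right) first, then f."""
    return {x: f[g[x]] for x in g}

c12, c13 = from_cycle((1, 2)), from_cycle((1, 3))
c123, c132 = from_cycle((1, 2, 3)), from_cycle((1, 3, 2))

left = compose(c12, c13)     # (12)(13): sends 1 to 3, 2 to 1, and 3 to 2
right = compose(c13, c12)    # (13)(12): sends 1 to 2, 2 to 3, and 3 to 1
```

Comparing `left` with `c132` and `right` with `c123` reproduces the two computations in the text and confirms that the order of composition matters.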


Let (R, +, ⋅) be a ring and let A ⊆ R. We say that A absorbs R if for every a ∈ A and x ∈ R, ax ∈ A 
and xa ∈ A. 


Note: Since in a ring, multiplication is not necessarily commutative, both conditions ax ∈ A and 
xa ∈ A may be necessary. In a commutative ring, either condition follows from the other. 


If (R, +, ⋅) is a ring, we say that a subset I of R is an ideal of R, and write I ⊲ R, if (I, +) is a subgroup 
of (R, +) and I absorbs R. 


Example 11.9: 


1. Consider the ring (ℤ, +, ⋅). Then (2ℤ, +, ⋅) is an ideal of ℤ because (2ℤ, +) is a subgroup of 
(ℤ, +) (see part 3 of Example 11.5) and when we multiply an even integer by any other integer, 
we get an even integer (so, 2ℤ absorbs ℤ). 


More generally, for each n ∈ ℤ⁺, (nℤ, +, ⋅) is an ideal of (ℤ, +, ⋅). 


2. If f: R → S is a ring homomorphism, then ker(f) is an ideal of R. We already showed in Note 3 
following Theorem 11.4 that (ker(f), +) is a subgroup of (R, +). To see that ker(f) absorbs 
R, let a ∈ ker(f) and let x ∈ R. Then f(ax) = f(a)f(x) = 0_S ⋅ f(x) = 0_S, so that 
ax ∈ ker(f). Also, f(xa) = f(x)f(a) = f(x) ⋅ 0_S = 0_S, so that xa ∈ ker(f). 


3. Any ring is an ideal of itself. Indeed, if a ∈ R and x ∈ R, then clearly ax ∈ R and xa ∈ R. 


4. {0_R} is an ideal of R because for all x ∈ R, 0_R ⋅ x = 0_R and x ⋅ 0_R = 0_R. 
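
The two conditions in the definition of an ideal can be spot-checked for 2ℤ inside ℤ. The Python sketch below is illustrative only and not part of the original text: it works on a finite sample of integers, so a passing check merely fails to find a counterexample, and the helper names are hypothetical. The odd integers are included as a contrasting subset that does not absorb ℤ.

```python
def is_additive_subgroup_on_sample(sample, member):
    """Spot-check that 0 is a member and that sums and negatives of sampled members are members."""
    return (member(0)
            and all(member(a + b) for a in sample for b in sample)
            and all(member(-a) for a in sample))

def absorbs_on_sample(subset_sample, ring_sample, member):
    """Spot-check that a*x and x*a are members for sampled a in the subset and x in the ring."""
    return all(member(a * x) and member(x * a)
               for a in subset_sample for x in ring_sample)

ring_sample = range(-12, 13)          # finite sample of Z

is_even = lambda k: k % 2 == 0        # membership test for 2Z
even_sample = [k for k in ring_sample if is_even(k)]
evens_form_subgroup = is_additive_subgroup_on_sample(even_sample, is_even)
evens_absorb = absorbs_on_sample(even_sample, ring_sample, is_even)

is_odd = lambda k: k % 2 == 1         # the odd integers, for contrast
odd_sample = [k for k in ring_sample if is_odd(k)]
odds_absorb = absorbs_on_sample(odd_sample, ring_sample, is_odd)   # fails: 3 * 2 = 6 is even
```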
Problem Set 11 


Full solutions to these problems are available for free download here: 


www.SATPrepGet800.com/PMFBXSG 


LEVEL 1 


1. Write the elements of S₄ in cycle notation. 


2. Draw a group multiplication table for S₄. Let H = {(1), (123), (132)} and K = {(1), (12)}. 
Show that H and K are subgroups of S₃ and determine which of these is a normal subgroup of 
S₃. 


LEVEL 2 


3. A Gaussian integer is a complex number of the form a + bi, where a, b ∈ ℤ. Let ℤ[i] be the set 
of Gaussian integers. Prove that (ℤ[i], +, ⋅) is a subring of (ℂ, +, ⋅). 


4. Let (G, ⋆) be a group with H a nonempty subset of G. Prove that (H, ⋆) is a subgroup of (G, ⋆) if 
and only if for all g, h ∈ H, g ⋆ h⁻¹ ∈ H. 


5. Let (R, +, ⋅) be a ring and define addition and multiplication on R × R componentwise, as was 
done in part 4 of Example 11.6. Prove that (R × R, +, ⋅) is a ring and that (R, +, ⋅) is isomorphic 
to a subring of (R × R, +, ⋅). 


LEVEL 3 


6. Prove that there are exactly two ring homomorphisms from ℤ to itself. 


7. Prove the following: 
(i) Ring isomorphism is an equivalence relation. 
(ii) If we let Aut(R) be the set of automorphisms of a ring R, then (Aut(R), ∘) is a group, 
where ∘ is composition. 


8. Let G be a group with H and K subgroups of G, and let G = H ∪ K. Prove that H = G or K = G. 


9. Prove that a commutative ring R is a field if and only if the only ideals of R are {0} and R. 


10. Prove that if X is a nonempty set of normal subgroups of a group G, then ⋂X is a normal subgroup 
of G. Similarly, prove that if X is a nonempty set of ideals of a ring R, then ⋂X is an ideal of R. 
Is the union of normal subgroups always a normal subgroup? Is the union of ideals always an 
ideal? 


11. Let ℤₙ[x] = {aₙxⁿ + aₙ₋₁xⁿ⁻¹ + ⋯ + a₁x + a₀ | a₀, a₁, …, aₙ ∈ ℤ}. In other words, ℤₙ[x] 
consists of all polynomials of degree at most n. Prove that (ℤₙ[x], +) is a commutative group 
for n = 0, 1, and 2, where addition is defined in the “usual way.” Then prove that ℤ₀[x] is a 
subgroup of ℤ₁[x] and ℤ₁[x] is a subgroup of ℤ₂[x]. What if we replace “all polynomials of 
degree at most n” with “all polynomials of degree n?” 


LEVEL 4 


12. Let N be a normal subgroup of a group G. For each g ∈ G, let gN = {gx | x ∈ N}. Prove that 
gN = hN if and only if gh⁻¹ ∈ N. Let G/N = {gN | g ∈ G}. Prove that (G/N, ∘) is a group, 
where ∘ is defined by gN ∘ hN = (gh)N. 


13. Let I be an ideal of a ring R. For each x ∈ R, let x + I = {x + z | z ∈ I}. Prove that 
x + I = y + I if and only if x − y ∈ I. Let R/I = {x + I | x ∈ R}. Prove that (R/I, +, ⋅) is a 
ring, where addition and multiplication are defined by (x + I) + (y + I) = (x + y) + I and 
(x + I)(y + I) = xy + I. 


14. Let ℤₙ = {[k] | k ∈ ℤ}, where [k] is the equivalence class of k under the equivalence ≡ₙ. Prove 
that (ℤₙ, +, ⋅) is a ring, where addition and multiplication are defined by [x] + [y] = [x + y] 
and [xy] = [x] ⋅ [y]. Then prove that ℤ/nℤ ≅ ℤₙ. Find the ideals of ℤ/15ℤ and ℤ₁₅, and show 
that there is a natural one-to-one correspondence between them. 


LEVEL 5 


15. Let Z[x] = {aₖxᵏ + aₖ₋₁xᵏ⁻¹ + ⋯ + a₁x + a₀ | k ∈ N ∧ a₀, a₁, …, aₖ ∈ Z}. (Z[x], +, ·) with
addition and multiplication defined in the “usual way” is called the polynomial ring over Z.
Prove that (Z[x], +, ·) is a ring. Then prove that (Zₙ[x], +, ·) is not a subring of (Z[x], +, ·)
for any n ∈ N. Let R[x] = {aₖxᵏ + aₖ₋₁xᵏ⁻¹ + ⋯ + a₁x + a₀ | k ∈ N ∧ a₀, a₁, …, aₖ ∈ R} for
an arbitrary ring R. Is (R[x], +, ·) necessarily a ring?


16. Let N be a normal subgroup of the group G, and define f: G → G/N by f(g) = gN. Prove that
f is a surjective group homomorphism with kernel N. Conversely, prove that if f: G → H is a
group homomorphism, then G/ker(f) ≅ f[G].


17. Let I be an ideal of a ring R, and define f: R → R/I by f(x) = x + I. Prove that f is a surjective
ring homomorphism with kernel I. Conversely, prove that if f: R → S is a ring homomorphism,
then R/ker(f) ≅ f[R].


18. Prove that (ᴿR, +, ·) is a ring, where addition and multiplication are defined pointwise. Then
prove that for each x ∈ R, Iₓ = {f ∈ ᴿR | f(x) = 0} is an ideal of ᴿR and the only ideal of ᴿR
containing Iₓ and not equal to Iₓ is ᴿR.


LESSON 12 - NUMBER THEORY
PRIMES, GCD, AND LCM 


Prime Numbers 


Recall that an integer a is divisible by an integer k, written k|a, if there is another integer b such that 
a = kb. We also say that k is a factor of a, k is a divisor of a, k divides a, or a is a multiple of k. For 
example, 7|21 because 21 = 7 · 3. Also, see Examples 4.3 and 4.4 from Lesson 4.


Notes: (1) Every integer is divisible by 1. Indeed, if n ∈ Z, then n = 1 · n.

(2) Every integer is divisible by itself. Indeed, if n ∈ Z, then n = n · 1.

(3) It follows from Notes 1 and 2 above that every integer greater than 1 has at least 2 factors.


A prime number is a natural number with exactly two positive integer factors.


Notes: (1) An equivalent definition of a prime number is the following: A prime number is an integer 
greater than 1 that is divisible only by 1 and itself. 


(2) An integer greater than 1 that is not prime is called composite. 


Example 12.1: 


1. 0 is not prime because every positive integer is a factor of 0. Indeed, if n ∈ Z⁺, then 0 = n · 0,
so that n|0.


2. 1 is not prime because it has only one positive integer factor: if 1 = kb with b > 0, then k = 1
and b = 1.


3. The first ten prime numbers are 2, 3, 5, 7, 11, 13, 17, 19, 23, and 29.


4. 4 is not prime because 4 = 2 · 2. In fact, the only even prime number is 2 because by definition,
an even integer has 2 as a factor.


5. 9 is the first odd integer greater than 1 that is not prime. Indeed, 3, 5, and 7 are prime, but 9 is
not because 9 = 3 · 3.


6. The first ten composite numbers are 4, 6, 8, 9, 10, 12, 14, 15, 16, and 18.
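

The lists in parts 3 and 6 of Example 12.1 can be checked directly from the definition by counting
positive factors. Here is a minimal Python sketch. The function names are our own choices for
illustration (they are not notation from this book), and no attempt is made to be efficient.

```python
def positive_factors(n):
    """Return the positive integer factors of a positive integer n."""
    return [k for k in range(1, n + 1) if n % k == 0]


def is_prime(n):
    """A prime is a natural number with exactly two positive integer factors."""
    return n > 1 and len(positive_factors(n)) == 2


def is_composite(n):
    """A composite number is an integer greater than 1 that is not prime."""
    return n > 1 and not is_prime(n)


# Collect the first ten primes and the first ten composites below 30.
first_ten_primes = [n for n in range(2, 30) if is_prime(n)][:10]
first_ten_composites = [n for n in range(2, 30) if is_composite(n)][:10]
print(first_ten_primes)
print(first_ten_composites)
```

Note that 0 and 1 are rejected by the test n > 1, in agreement with parts 1 and 2 of the example.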


Two very important facts about prime numbers (that we will prove in this Lesson) are the following.

1. There are infinitely many prime numbers.
2. Every integer greater than 1 can be written uniquely as a product of prime numbers, up to the
order in which the factors are written.


The second fact is known as The Fundamental Theorem of Arithmetic. It is used often in many
branches of mathematics.


When we write an integer n as a product of other integers, we call that product a factorization of n. If
all the factors in the product are prime, we call the product a prime factorization of n.


Example 12.2: 


1. 20 = 4 · 5 is a factorization of 20. This is not a prime factorization of 20 because 4 is not prime.
20 = 2 · 10 is another factorization of 20. This example shows that factorizations in general are
not unique.


2. An example of a prime factorization of 20 is 20 = 2 · 2 · 5. We can also write this prime
factorization as 2 · 5 · 2 or 5 · 2 · 2. So, you can see that if we consider different orderings of
the factors as different factorizations, then prime factorizations are not unique. This is why we 
say that prime factorizations are unique, up to the order in which the factors are written. 


3. A prime number is equal to its own prime factorization. In other words, we consider a prime 
number to be a product of primes with just one factor in the product. For example, the prime 
factorization of 2 is 2. 


Recall from Lesson 4 that the Well Ordering Principle says that every nonempty subset of natural 
numbers has a least element. 


We will now use the Well Ordering Principle to prove half of the Fundamental Theorem of Arithmetic.


Theorem 12.1: Every integer greater than 1 can be written as a product of prime numbers.


Note that we left out the word “uniquely” here. The uniqueness is the second half of the Fundamental 
Theorem of Arithmetic, which we will prove later in this lesson. 


Analysis: We will prove this theorem by contradiction using the Well Ordering Principle. The idea is
simple. If an integer n greater than 1 is not prime, then it can be factored as kr with 1 < k < n and
1 < r < n. If k and r can be written as a product of primes, then so can n because n is simply the
product of all the factors of k and r. For example, 6 = 2 · 3 and 20 = 2 · 2 · 5. Therefore, we have
120 = 6 · 20 = (2 · 3) · (2 · 2 · 5). Let’s write the proof.


Proof of Theorem 12.1: Suppose toward contradiction that there exists an integer greater than 1 that
cannot be written as a product of prime numbers. By the Well Ordering Principle, there is a least such
integer, let’s call it n. Since n cannot be written as a product of prime numbers, then in particular, n is
not prime. So, we can write n = kr with k, r ∈ N and 1 < k < n and 1 < r < n. Since n is the least
integer greater than 1 that cannot be written as a product of prime numbers, k and r can both be
written as products of prime numbers. But then n = kr is also a product of prime numbers,
contradicting our choice of n. This contradiction shows that every integer greater than 1 can be written
as a product of prime numbers. □
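

The idea in the Analysis (split n as kr, then factor k and r separately) can be turned into a short
recursive procedure. The following Python sketch is only an illustration of that idea. The name
prime_factors is our own, and the search for k is deliberately naive.

```python
def prime_factors(n):
    """Return a list of primes whose product is n, for an integer n > 1."""
    if n <= 1:
        raise ValueError("n must be an integer greater than 1")
    # Look for a factorization n = k * r with 1 < k < n and 1 < r < n.
    for k in range(2, n):
        if n % k == 0:
            r = n // k
            # k and r are both smaller than n, so we factor each one in turn.
            return prime_factors(k) + prime_factors(r)
    # No such k exists, so n is prime and is its own prime factorization.
    return [n]


print(prime_factors(120))
print(prime_factors(20))
```

The recursion must stop because k and r are always smaller than n. This is the same reason the
proof works: a least counterexample cannot exist.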


Notes: (1) Recall that a proof by contradiction works as follows:

1. We assume the negation of what we are trying to prove.
2. We use a logically valid argument to derive a statement which is false.
3. Since the argument was logically valid, the only possible error is our original assumption.
Therefore, the negation of our original assumption must be true.


The negation of the statement “Every integer greater than 1 can be written as a product of prime
numbers” is “There is an integer greater than 1 that cannot be written as a product of prime numbers.”
If we let S = {k ∈ N | k > 1 ∧ k cannot be written as a product of prime numbers}, then by our
assumption, S ≠ ∅. It follows from the Well Ordering Principle that S has a least element, which in the
proof above, we name n.


The argument then proceeds to factor n as kr, where k and r are both greater than 1 and less than n.
We can factor n this way because n is not prime.


Since n is the least element of S, it follows that k and r are not in S. Therefore, k and r can be written
as a product of prime numbers. But this immediately gives us a prime factorization of n, contradicting
our original assumption.


Since every step of our argument was logically valid, the only thing that could have been wrong was 
our original assumption. So, every integer greater than 1 can be written as a product of prime numbers. 


(2) In general, if P(x) is a property, then the negation of ∀x(P(x)) is ∃x(¬P(x)). In other words,
when we pass a negation symbol through a universal quantifier, the quantifier changes to an existential
quantifier. So, ¬∀x(P(x)) ≡ ∃x(¬P(x)), where ≡ is pronounced “is logically equivalent to.” For
Theorem 12.1, the property P(x) is q(x) → r(x), where q(x) is “x > 1” and r(x) is “x can be written
as a product of prime numbers.” Recall from part 2 of Example 9.5 in Lesson 9 that ¬(q(x) → r(x)) is
logically equivalent to q(x) ∧ ¬r(x). So ∃x(¬P(x)) says, “There is an integer x such that x > 1 and x
cannot be written as a product of prime numbers.”


In general (although not needed here), we also have ¬∃x(P(x)) ≡ ∀x(¬P(x)).


Corollary 12.2: Every integer greater than 1 has a prime factor.


Proof: Let n be an integer greater than 1. By Theorem 12.1, n can be written as a product of prime 
numbers. Let p be any of the prime numbers in that product. Then p is a prime factor of n. □


Theorem 12.3: There are infinitely many primes. 


Analysis: Starting with a prime number p > 1, we want to find a prime number greater than p. This 
will prove that there are infinitely many prime numbers, because if P is a finite set of prime numbers, then
the previous statement implies that we can find a prime number greater than the biggest number in 
the set P. 


Now recall that if n is a positive integer, then the number n! (pronounced “n factorial”) is defined by
n! = 1 · 2 ⋯ n. For example, 3! = 1 · 2 · 3 = 6 and 4! = 1 · 2 · 3 · 4 = 24.


If n > 2, then n! is a number larger than n that is divisible by every positive integer less than or equal 
to n. For example, 3! = 6 is divisible by 1, 2, and 3, and 4! = 24 is divisible by 1, 2, 3, and 4. 


Now, n! is certainly not prime. In fact, it has lots of factors! For example, 4! = 24 has 8 factors (what
are they?). Therefore, n! itself won’t work for us. So, we add 1 to this number to get the number
M = n! + 1.


By adding 1 to n! to produce M, we have destroyed almost all the divisibility that we had. Specifically,
M is not divisible by any integer k with 1 < k ≤ n. To see this, let k be an integer satisfying 1 < k ≤ n.
We know that there is an integer r such that n! = kr (because n! is divisible by k). If M were divisible
by k, then there would be an integer s such that M = ks. But then, by subtracting n! from each side of
the equation M = n! + 1, we get 1 = M − n! = ks − kr = k(s − r). Since k > 1 and s − r is an
integer, this is impossible! Therefore, M is not divisible by k.


It would be nice if we could prove that M is prime. Then M would be a prime number greater than n, 
thus completing the proof. Sometimes M does turn out to be prime. For example, if n = 2, then 
M = 2! + 1 = 2 + 1 = 3, which is prime. However, it is unfortunate for us that M is not always prime.
In Problem 6 below you will find values for n for which M is not prime. 


However, even if M is not prime, all is not lost. By Corollary 12.2, we know that M has a prime factor, 
let’s call it p. We also know that M is not divisible by any integer k with 1 < k ≤ n. It follows that p is
a prime number greater than n.


I think we’re ready to write out the proof.


Proof of Theorem 12.3: Let P be a finite set of prime numbers with greatest member q and let 
M = q! + 1. By Corollary 12.2, M has a prime factor p. So, there is an integer k such that M = pk. 


We show that p > q. 


Suppose toward contradiction that p ≤ q. Then p|q!. So, there is an integer r such that q! = pr. It
follows that 1 = M − q! = pk − pr = p(k − r). So, p = 1, which contradicts that p is prime.


It follows that p > q and so, p is greater than every prime number in P. Since P was an arbitrary finite
set of prime numbers, we have shown that there are infinitely many prime numbers. □
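

To see the proof in action, we can compute M = q! + 1 for a few small primes q and look at a prime
factor of M. The Python sketch below is only an illustration. The helper smallest_prime_factor is
our own name, and it finds a prime factor by the slowest possible search, so it is practical only
for small values of q.

```python
from math import factorial


def smallest_prime_factor(m):
    """Return the least integer p > 1 that divides m (for m > 1).

    The least divisor greater than 1 is always prime, because any proper
    factor of it would be a smaller divisor of m.
    """
    p = 2
    while m % p != 0:
        p += 1
    return p


for q in [2, 3, 5, 7]:
    M = factorial(q) + 1
    p = smallest_prime_factor(M)
    # The proof guarantees p > q, whether or not M itself is prime.
    print(f"q = {q}: M = {M}, prime factor p = {p}, p > q: {p > q}")
```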


The Division Algorithm 


In Lesson 4 (Example 4.7 and the notes following), we showed that every integer is even or odd, and 
never both. In other words, if n ∈ Z, there are unique integers k and r such that n = 2k + r, where
r = 0 or r = 1. We sometimes say, “When n is divided by 2, k is the quotient and r is the remainder.”
Observe that when an integer n is divided by 2, the quotient can be any integer, but the remainder can 
be only 0 or 1. 


Example 12.3: 


1. When 11 is divided by 2, the quotient is 5 and the remainder is 1. That is, 11 = 2 · 5 + 1.


2. When 20 is divided by 2, the quotient is 10 and the remainder is 0. That is, 20 = 2 · 10 + 0, or
equivalently, 20 = 2 · 10. Notice that in this case, 20 is divisible by 2.


3. When -11 is divided by 2, the quotient is -6 and the remainder is 1. That is, -11 = 2(-6) + 1.
Compare this to the first example. Based on that example, most students would probably guess
that the quotient here would turn out to be -5. But as you can see, that is not the case.


The Division Algorithm generalizes the notion of an integer n being “even or odd” (2k or 2k + 1) to n
being equal to mk + r, where 0 ≤ r < m.


For example, for m = 3, the Division Algorithm will tell us that every integer can be written uniquely 
in one of the three forms 3k, 3k + 1, or 3k + 2. Observe that when an integer n is divided by 3, the 
quotient can be any integer, but the remainder can be only 0,1, or 2. 


As one more example, for m = 4, the Division Algorithm will tell us that every integer can be written 
uniquely in one of the four forms 4k, 4k + 1, 4k + 2, or 4k + 3. Observe that when an integer n is 
divided by 4, the quotient can be any integer, but the remainder can be only 0, 1, 2, or 3. 


Example 12.4: 


1. When 14 is divided by 3, the quotient is 4 and the remainder is 2. That is, 14 = 3 · 4 + 2.


2. When 36 is divided by 4, the quotient is 9 and the remainder is 0. That is, 36 = 4 · 9 + 0, or
equivalently, 36 = 4 · 9. Notice that in this case, 36 is divisible by 4.


3. When 17 is divided by 5, the quotient is 3 and the remainder is 2. That is, 17 = 5 · 3 + 2.

4. When -17 is divided by 5, the quotient is -4 and the remainder is 3. That is, -17 = 5(-4) + 3.
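

The quotients and remainders in Examples 12.3 and 12.4 can be reproduced with Python’s built-in
divmod function. For a positive divisor, divmod rounds the quotient down, so the remainder is never
negative. The wrapper name quotient_and_remainder below is our own. Be aware that some other
programming languages round the quotient toward 0 instead, and would report a quotient of -5 and a
remainder of -1 when -11 is divided by 2.

```python
def quotient_and_remainder(n, m):
    """Return (k, r) with n = m*k + r and 0 <= r < m, assuming m > 0."""
    if m <= 0:
        raise ValueError("m must be a positive integer")
    # For a positive divisor, divmod rounds the quotient down,
    # so the remainder always satisfies 0 <= r < m.
    return divmod(n, m)


for n, m in [(11, 2), (20, 2), (-11, 2), (14, 3), (36, 4), (17, 5), (-17, 5)]:
    k, r = quotient_and_remainder(n, m)
    print(f"{n} = {m}({k}) + {r}")
```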


Theorem 12.4 (The Division Algorithm): Let n and m be integers with m > 0. Then there are unique
integers k and r such that n = mk + r with 0 ≤ r < m.


Many students find the standard proof of the Division Algorithm to be quite hard to follow. I know that
when I read the proof for the first time, I found it quite confusing. To better understand the argument,
let’s first run a couple of simulations using specific examples that mimic the proof.


Simulation 1: Let’s let n = 7 and m = 2. With these choices for n and m, the Division Algorithm says
that there are unique integers k and r such that 7 = 2k + r and 0 ≤ r < 2 (in other words, r = 0 or
r = 1).


Let’s look at the equation 7 = 2k + r in the form 7 − 2k = r. In particular, let’s look at the possible
values of 7 − 2k as k ranges over all possible integers. Let’s do this by matching up each integer k with
the corresponding value of 7 − 2k:


| k      | ⋯ | -4 | -3 | -2 | -1 | 0 | 1 | 2 | 3 |  4 | ⋯ |
| 7 − 2k | ⋯ | 15 | 13 | 11 |  9 | 7 | 5 | 3 | 1 | -1 | ⋯ |


Observe that the top row is simply “listing” all the integers. The “⋯” to the left of -4 and to the right
of 4 are there to indicate that this list keeps going infinitely in each direction. However, I did make sure
to include the most important values in the visible part of our list.


We get each value in the bottom row by substituting the value above it for k in the expression 7 − 2k.
For example, for k = -4, we have 7 − 2k = 7 − 2(-4) = 7 + 8 = 15.


Notice that the values in the bottom row decrease by 2 units for each 1 unit increase in k. This is
because m = 2.


We highlighted the column where k = 3 and r = 7 − 2k = 1. This is the column where the smallest
nonnegative number appears in the bottom row. In other words, we let r be the least nonnegative value
of 7 − 2t, as t ranges over all the integers, and we let k be the corresponding t-value.


In general, how do we know that these values exist?


Well, since n ≥ 0 (n = 7 in this example), the expression n − mt ≥ 0 when t = 0. It follows that the
set {n − mt | t ∈ Z ∧ n − mt ≥ 0} = {7 − 2t | t ∈ Z ∧ 7 − 2t ≥ 0} is not empty (7 − 2 · 0 = 7 is in
this set). So, we can invoke the Well Ordering Principle to get a least element r. In this simulation, r
will turn out to be 1 with a corresponding k-value of 3. (We will see what happens if n < 0 in the next
simulation.)


By taking r to be the least element from a set of natural numbers, we know that r will be nonnegative.
But how do we know that r will be less than 2? We use the fact that the bottom row decreases by 2
units for each 1 unit increase in the top row.


Suppose we accidentally chose r = 3. Then we have 7 − 2k = 3. If we subtract 2 from each side of
this equation, we get 7 − 2k − 2 = 1. Using distributivity, we have that 7 − 2k − 2 is equal to
7 − 2(k + 1). So, 7 − 2(k + 1) = 1. Looks like we chose the wrong value for r. What we just showed
is that if we increase k by 1 (from 2 to 3), we decrease r by 2 (from 3 to 1).


In general, if r ≥ 2, then we have n − 2k ≥ 2, so that n − 2k − 2 ≥ 0. Thus, n − 2(k + 1) ≥ 0. But
n − 2(k + 1) = n − 2k − 2 < n − 2k. This contradicts that r was the least possible value of n − 2t
with n − 2t ≥ 0. It follows that r < 2.


Now let’s check uniqueness. So, we have 7 = 2 · 3 + 1. How do we know that there aren’t two other
numbers k′ and r′ with 0 ≤ r′ ≤ 1 such that 7 = 2k′ + r′?


Well, if there were, then we would have 2 · 3 + 1 = 2k′ + r′. Subtracting 2k′ from each side of the
equation and subtracting 1 from each side of the equation gives us 2 · 3 − 2k′ = r′ − 1. We now use
the distributive property on the left to get 2(3 − k′) = r′ − 1. This equation shows that 2 is a factor
of r′ − 1. r′ can’t be 0 because 2 is not a factor of -1. Therefore, r′ = 1 (remember that 0 and 1 are
the only two choices for r′). So, 2(3 − k′) = 0, and therefore, 3 − k′ = 0. So, k′ = 3. Oh, look at that!
r′ and k′ are the same as r and k.


So, we just proved that there is exactly one way to write 7 in the form 2k + r with k and r integers
and 0 ≤ r < 2. We showed that 7 = 2 · 3 + 1 is the only way to do it.


Simulation 2: This time, let’s let n = -4 and m = 3. With these choices for n and m, the Division
Algorithm says that there are unique integers k and r such that -4 = 3k + r and 0 ≤ r < 3 (in other
words, r = 0, r = 1, or r = 2).


Let’s look at the equation -4 = 3k + r in the form -4 − 3k = r, and as we did in Simulation 1, let’s
match up each integer k with the corresponding value of -4 − 3k:


| k       | ⋯ | -4 | -3 | -2 | -1 |  0 |  1 |   2 |   3 |   4 | ⋯ |
| -4 − 3k | ⋯ |  8 |  5 |  2 | -1 | -4 | -7 | -10 | -13 | -16 | ⋯ |


This time, since m = 3, the values in the bottom row decrease by 3 units for each 1 unit increase in k. 


We highlighted the column where k = -2 and r = -4 − 3(-2) = -4 + 6 = 2 because it is the column
where the smallest nonnegative number appears in the bottom row. This time 2 is the smallest possible
value of r, and this r-value corresponds to a k-value of -2.


Since n < 0 this time (n = -4 in this example), setting t = 0 in the expression n − mt does not
produce a nonnegative value. This time, we let t = n to get n − m · n (specifically, for this simulation
we set t = -4 to get -4 − 3(-4) = -4 + 12 = 8, which is greater than 0). It follows that the set
{n − mt | t ∈ Z ∧ n − mt ≥ 0} = {-4 − 3t | t ∈ Z ∧ -4 − 3t ≥ 0} is not empty. So, once again, we
can invoke the Well Ordering Principle to get a least element r. In this simulation, r will turn out to be
2 with a corresponding k-value of -2.


As in Simulation 1, it is clear that r ≥ 0, and we use the fact that the bottom row decreases by 3 units
for each 1 unit increase in the top row to show that r < 3.


Suppose we accidentally chose r = 5. Then we have -4 − 3k = 5. If we subtract 3 from each side of
this equation, we get -4 − 3k − 3 = 2. But using distributivity, we have that -4 − 3k − 3 is equal to
-4 − 3(k + 1). So, -4 − 3(k + 1) = 2. What we just showed is that if we increase k by 1 (from -3 to -2),
we decrease r by 3 (from 5 to 2).


In general, if r ≥ 3, then we have n − 3k ≥ 3, so that n − 3k − 3 ≥ 0. Thus, n − 3(k + 1) ≥ 0. But
n − 3(k + 1) = n − 3k − 3 < n − 3k. This contradicts that r was the least possible value of n − 3t
with n − 3t ≥ 0. It follows that r < 3.


I leave it as an exercise for the reader to check uniqueness for this special case.


Let’s move on to the proof of the Theorem.


Proof of Theorem 12.4: Let n, m ∈ Z with m > 0, and let S = {n − mt | t ∈ Z ∧ n − mt ≥ 0}. To see
that S ≠ ∅, we consider two cases. If n ≥ 0, then let t = 0, and we have n − mt = n ∈ S. If n < 0,
then let t = n, so that we have n − mt = n − mn = n(1 − m). Since m ≥ 1, we have 1 − m ≤ 0. It
follows that n(1 − m) ≥ 0, and so, n − mt ∈ S. In both cases, we have shown that S ≠ ∅.


Since S is a nonempty subset of natural numbers, by the Well Ordering Principle, S has a least element
r = n − mk, where k ∈ Z. Since S ⊆ N, r ≥ 0. By adding mk to each side of the equation, we have
n = mk + r.


We need to show that r < m. Suppose toward contradiction that r ≥ m. Substituting n − mk for r
gives us n − mk ≥ m. Subtracting m from each side of this last inequality gives (n − mk) − m ≥ 0.
Now, since m > 0, r > r − m = (n − mk) − m. But (n − mk) − m = n − mk − m = n − m(k + 1),
and so, (n − mk) − m is an element of S smaller than r, contradicting r being the least element of S.
This contradiction tells us that we must have r < m.


We still need to prove that k and r are unique. Suppose that n = mk₁ + r₁ and n = mk₂ + r₂ with
both 0 ≤ r₁ < m and 0 ≤ r₂ < m. Without loss of generality, we may assume that r₂ ≥ r₁.


By a simple substitution, mk₁ + r₁ = mk₂ + r₂. Subtracting mk₂ from each side of the equation and
simultaneously subtracting r₁ from each side of the equation, we get mk₁ − mk₂ = r₂ − r₁. Factoring
m on the left gives m(k₁ − k₂) = r₂ − r₁, and we see that m|r₂ − r₁.


Since r₂ ≥ r₁, we have r₂ − r₁ ≥ 0. Since we have r₁ ≥ 0 and r₂ < m, we have r₂ − r₁ < m − 0 = m.
So, m|r₂ − r₁ and 0 ≤ r₂ − r₁ < m. It follows that r₂ − r₁ = 0. So, r₂ = r₁. Finally, r₂ = r₁ and
mk₁ + r₁ = mk₂ + r₂ together imply that mk₁ = mk₂, and so, k₁ = k₂. □
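

The existence half of the proof is really a recipe: start at a value of t for which n − mt is
nonnegative, and keep increasing t until the next step would make n − mt negative. Here is a
minimal Python sketch of that recipe. The function name division_algorithm is our own, and the
loop is meant to mirror the proof rather than to be fast.

```python
def division_algorithm(n, m):
    """Return (k, r) with n = m*k + r and 0 <= r < m, following the proof."""
    if m <= 0:
        raise ValueError("m must be a positive integer")
    # Pick a starting t with n - m*t >= 0, exactly as in the proof:
    # t = 0 works when n >= 0, and t = n works when n < 0.
    t = 0 if n >= 0 else n
    # Increasing t by 1 lowers n - m*t by m. Keep going while the next
    # value is still nonnegative; we stop at the least element of S.
    while n - m * (t + 1) >= 0:
        t += 1
    return t, n - m * t


print(division_algorithm(7, 2))    # Simulation 1
print(division_algorithm(-4, 3))   # Simulation 2
```

The loop stops precisely when n − m(t + 1) < 0, that is, when r − m < 0. This is the inequality
r < m that the proof establishes by contradiction.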


GCD and LCM 


Let a and b be two integers. An integer j is a common divisor (or common factor) of a and b if j is a 
factor of both a and b. An integer k is a common multiple of a and b if k is a multiple of both a and b. 


Example 12.5: Let a = 6 and b = 15. The positive divisors of a are 1, 2, 3, and 6. The positive divisors 
of b are 1,3, 5, and 15. Therefore, the positive common divisors of a and b are 1 and 3. 


For each positive divisor there is a corresponding negative divisor. So, a complete list of the divisors of
a is 1, 2, 3, 6, -1, -2, -3, and -6 and a complete list of the divisors of b is 1, 3, 5, 15, -1, -3, -5,
and -15. Therefore, a complete list of the common divisors of a and b is 1, 3, -1, and -3.


If both a and -a are in a list, we will sometimes use the notation ±a instead of listing a and -a
separately. In this example, we can say that the complete list of common divisors of a and b is ±1, ±3.


The multiples of a are ±6, ±12, ±18, ±24, ±30, ±36, ... and so on. The multiples of b are
±15, ±30, ±45, ±60, ... and so on. Therefore, the common multiples of a and b are
±30, ±60, ±90, ±120, ... and so on.


Again, let a and b be distinct integers. The greatest common divisor (or greatest common factor) of a 
and b, written gcd(a, b), is the largest common divisor of a and b. The least common multiple of a 
and b, written lcm(a, b), is the smallest positive common multiple of a and b.


Example 12.6:

1. From Example 12.5, it’s easy to see that gcd(6, 15) = 3 and lcm(6, 15) = 30.


2. gcd(2, 3) = 1 and lcm(2, 3) = 6. More generally, if p and q are prime numbers with p ≠ q,
then gcd(p, q) = 1 and lcm(p, q) = pq.


3. gcd(4, 15) = 1 and lcm(4, 15) = 60. Observe that neither 4 nor 15 is prime, and yet their gcd
is 1 and their lcm is the product of 4 and 15. This is because 4 and 15 have no common factors
except for 1 and -1. We say that 4 and 15 are relatively prime.
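

The values in Example 12.6 can be checked straight from the definitions: list the common divisors
and take the largest, and step through the multiples of a until one is also a multiple of b. The
Python sketch below does exactly that for nonzero integers. The function names are our own, and
the comparison with math.gcd from Python’s standard library is included only as a sanity check.

```python
from math import gcd


def positive_divisors(a):
    """Return the positive divisors of a nonzero integer a."""
    return [j for j in range(1, abs(a) + 1) if a % j == 0]


def gcd_by_definition(a, b):
    """Largest common divisor of two nonzero integers."""
    return max(set(positive_divisors(a)) & set(positive_divisors(b)))


def lcm_by_definition(a, b):
    """Smallest positive common multiple of two nonzero integers."""
    k = abs(a)
    # Step through the positive multiples of a until b divides one of them.
    while k % b != 0:
        k += abs(a)
    return k


for a, b in [(6, 15), (2, 3), (4, 15)]:
    assert gcd_by_definition(a, b) == gcd(a, b)
    print(f"gcd({a}, {b}) = {gcd_by_definition(a, b)}, lcm({a}, {b}) = {lcm_by_definition(a, b)}")
```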


Note that if p and q are prime numbers with p ≠ q, then p and q are relatively prime.


We have the following more general result: if a and b are relatively prime integers, then
gcd(a, b) = 1 and lcm(a, b) = ab (see Theorem 12.10 below).


We can extend all these ideas to larger sets of numbers. Specifically, let X be a finite set of integers
containing at least one nonzero integer. Then the greatest common divisor of the integers in X, written
gcd(X) (or gcd(a₁, a₂, …, aₙ), where X = {a₁, a₂, …, aₙ}) is the largest integer that divides every
integer in the set X, and the least common multiple of the integers in X, written lcm(X) (or
lcm(a₁, a₂, …, aₙ)) is the smallest positive integer that each integer in the set X divides.


For convenience, if X contains only 0, we define gcd(X) = 0.


Also, the integers in the set X are said to be mutually relatively prime if gcd(X) = 1. The integers in
the set X are said to be pairwise relatively prime if for each pair a, b ∈ X with a ≠ b, gcd(a, b) = 1.


Example 12.7: 
1. gcd(10, 15, 35) = 5 and lcm(10, 15, 35) = 210.


2. gcd(2, 3, 12) = 1 and lcm(2, 3, 12) = 12. Notice that here 2, 3, and 12 are mutually relatively
prime, but not pairwise relatively prime because for example, gcd(2, 12) = 2 ≠ 1.


3. gcd(10, 21, 143) = 1 and lcm(10, 21, 143) = 30,030. In this case, we have 10, 21, and 143
are pairwise relatively prime.


We have the following result: if X = {a₁, a₂, …, aₙ} is a set of pairwise relatively prime integers,
then gcd(X) = 1 and lcm(X) = a₁a₂ ⋯ aₙ. The proof of this is left as an optional exercise for
the reader. Also note that pairwise relatively prime implies mutually relatively prime.


4. For a set X with just one element a, gcd(a) = a and lcm(a) = a. In particular, gcd(0) = 0 and
lcm(0) = 0.
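

Here is a small Python sketch that checks parts 1 through 3 of Example 12.7 and tests the two
notions of “relatively prime” for a set. It relies on two standard facts that have not been proved
at this point in the lesson: the gcd and lcm of several integers can be computed two at a time, and
lcm(a, b) = ab / gcd(a, b) for positive integers a and b. The function names are our own.

```python
from functools import reduce
from itertools import combinations
from math import gcd


def gcd_of_set(xs):
    """gcd of a finite collection of integers, computed two at a time."""
    return reduce(gcd, xs)


def lcm_of_set(xs):
    """lcm of a finite collection of positive integers, two at a time."""
    return reduce(lambda a, b: a * b // gcd(a, b), xs)


def mutually_relatively_prime(xs):
    """True when the gcd of the whole collection is 1."""
    return gcd_of_set(xs) == 1


def pairwise_relatively_prime(xs):
    """True when every pair of distinct entries has gcd 1."""
    return all(gcd(a, b) == 1 for a, b in combinations(xs, 2))


for xs in [[10, 15, 35], [2, 3, 12], [10, 21, 143]]:
    print(xs, gcd_of_set(xs), lcm_of_set(xs),
          mutually_relatively_prime(xs), pairwise_relatively_prime(xs))
```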


Let a, b ∈ Z. A linear combination of a and b is an expression of the form ma + nb with m, n ∈ Z. We
call the integers m and n weights.


Example 12.8:


1. Since 5 · 10 − 2 · 15 = 50 − 30 = 20, we see that 20 is a linear combination of 10 and 15.
When we write 20 as 5 · 10 − 2 · 15, the weights are 5 and -2.


This is not the only way to write 20 as a linear combination of 10 and 15. For example, we also
have -1 · 10 + 2 · 15 = -10 + 30 = 20. When we write 20 as -1 · 10 + 2 · 15, the weights
are -1 and 2.


2. Any number that is a multiple of either 10 or 15 is a linear combination of 10 and 15 because
we can allow weights to be 0. For example, 80 is a linear combination of 10 and 15 because
80 = 8 · 10 + 0 · 15.


Also, 45 is a linear combination of 10 and 15 because 45 = 0 · 10 + 3 · 15.


3. We will see in Theorem 12.5 below that gcd(a, b) can always be written as a linear combination
of a and b. For example, gcd(10, 15) = 5, and we have 5 = -1 · 10 + 1 · 15.


4. Using the same theorem mentioned in 3, if a and b are relatively prime, then 1 can be written
as a linear combination of a and b. For example, 4 and 15 are relatively prime and we have
1 = 4 · 4 − 1 · 15.


Theorem 12.5: Let a and b be integers, at least one of which is not 0. Then gcd(a, b) is the least positive
integer k such that there exist m, n ∈ Z with k = ma + nb.


This theorem says two things. First, it says that gcd(a, b) can be written as a linear combination of a 
and b. Second, it says that any positive integer smaller than gcd(a, b) cannot be written as a linear 
combination of a and b. 


Proof: We first prove the theorem for a, b ∈ Z⁺. So, let a, b be positive integers and let S be the set of
all positive linear combinations of a and b with weights in Z.

S = {ma + nb | m, n ∈ Z ∧ ma + nb > 0}


Notice that a, b ∈ S because a = 1a + 0b and b = 0a + 1b. In particular, S ≠ ∅. By the Well Ordering
Principle, S has a least element k. By the definition of S, there exist m, n ∈ Z with k = ma + nb.


By the Division Algorithm, there are s, r ∈ Z with a = ks + r and 0 ≤ r < k.


So, r = a − ks = a − (ma + nb)s = a − mas − nbs = (1 − ms)a − (ns)b. We see that r is a linear
combination of a and b. Since r < k and r is a linear combination of a and b, r cannot be in S (because
k is the least element of S). So, r must be 0. It follows that a = ks. Therefore, k|a.


Replacing a by b in the last two paragraphs shows that k|b as well. So, k is a common divisor of a and
b. Now, if c is another common divisor of a and b, then by Problem 7 from Problem Set 4 in Lesson 4,
c is a divisor of any linear combination of a and b. Since k is a linear combination of a and b, c is a
divisor of k. Since every common divisor of a and b is also a divisor of k, it follows that k = gcd(a, b).


Since ma = (-m)(-a) and nb = (-n)(-b), the result holds whenever a and b are both nonzero.


Finally, suppose a = 0 or b = 0. Without loss of generality, let a = 0. Then b ≠ 0. So, gcd(a, b) = b
(or -b if b < 0). We also have for any m, n ∈ Z, ma + nb = m · 0 + nb = nb. The least positive
integer of the form nb is 1 · b = b (or -1 · b if b < 0). So, the result holds in this case as well. □
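

Theorem 12.5 can be tested numerically by searching for the least positive value of ma + nb. The
Python sketch below uses brute force. The function name is our own, and the window of weights
that it searches (from -max(|a|, |b|) to max(|a|, |b|)) is an assumption chosen to be large enough
for the small examples in this lesson. It is not part of the theorem.

```python
from math import gcd


def least_positive_combination(a, b):
    """Search small weights for the least positive value of m*a + n*b.

    Returns (k, m, n) with k = m*a + n*b. The search window for the
    weights is an assumption that suffices for small demonstrations.
    """
    bound = max(abs(a), abs(b))
    best = None
    for m in range(-bound, bound + 1):
        for n in range(-bound, bound + 1):
            k = m * a + n * b
            if k > 0 and (best is None or k < best[0]):
                best = (k, m, n)
    return best


for a, b in [(10, 15), (4, 15), (-6, 15), (0, 7)]:
    k, m, n = least_positive_combination(a, b)
    print(f"least positive combination of {a} and {b}: {k} = ({m})({a}) + ({n})({b})")
    assert k == gcd(a, b)
```

The weights found this way are not unique, as part 1 of Example 12.8 already showed for the
number 20.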


We’re almost ready to finish proving the Fundamental Theorem of Arithmetic. We will first prove two 
preliminary results that will make the proof easier. 


Theorem 12.6: Let a, b, c ∈ Z⁺ with a and b relatively prime and a|bc. Then a|c.


Proof: Let a, b, c ∈ Z⁺ with a and b relatively prime and let a|bc. Since gcd(a, b) = 1, by Theorem
12.5, there are integers m and n with 1 = ma + nb. Since a|bc, there is an integer k such that
bc = ak. Multiplying each side of the equation 1 = ma + nb by c and using the distributive property,
c = c(ma + nb) = cma + cnb = cma + nbc = cma + nak = a(cm + nk). Since c, m, n, k ∈ Z and
Z is closed under addition and multiplication, cm + nk ∈ Z. Therefore, a|c. □


Theorem 12.7: Let p be prime and let a₁, a₂, …, aₙ be positive integers such that p|a₁a₂ ⋯ aₙ. Then
there is an integer j with 1 ≤ j ≤ n such that p|aⱼ.


Proof: We will prove this theorem by induction on n ≥ 1.


Base Case (n = 1): We are given that p is prime, a₁ ∈ Z⁺, and p|a₁. Wait a sec... p|a₁ is the conclusion
we were looking for. So, the theorem holds for n = 1.


Inductive Step: Let k ∈ N and assume that the result holds for n = k.


Let p be prime and let a₁, a₂, …, aₖ, aₖ₊₁ be positive integers such that p|a₁a₂ ⋯ aₖaₖ₊₁. Since p is
prime, its only positive factors are 1 and p. Therefore, gcd(p, a₁a₂ ⋯ aₖ) is either 1 or p.


If gcd(p, a₁a₂ ⋯ aₖ) = 1, then by Theorem 12.6, p|aₖ₊₁. If gcd(p, a₁a₂ ⋯ aₖ) = p, then p|a₁a₂ ⋯ aₖ,
and by our inductive assumption, there is an integer j with 1 ≤ j ≤ k such that p|aⱼ.


Therefore, the result holds for n = k + 1.

By the Principle of Mathematical Induction, the result holds for all n ∈ N with n ≥ 1. □


We are finally ready to finish the proof of the Fundamental Theorem of Arithmetic.


Theorem 12.8 (The Fundamental Theorem of Arithmetic): Every integer greater than 1 can be written 
uniquely as a product of prime numbers, up to the order in which the factors are written. 


Proof: By Theorem 12.1, every integer greater than 1 can be written as a product of prime numbers. 
We need to show that any two such prime factorizations are equal. Assume toward contradiction that 
n can be written in the following two different ways: n = pip2°** Pk = q1q2 t qr, where 
P1,D 20 ++» Pk» V1 42» ++) Qr are prime numbers. Without loss of generality, assume pı < pz < + < Pk 
and qı < q2 <: < q,. Also, by cancelling common primes on the left with common primes on the 
right, we may assume that for alli < k and j < r, pi # qj. Suppose 1 < i < k. Then p;|p,p2 -- Dx. 
Since pipz Pk = 9192 °** qr, We have p;|q192 °°: qr. By Theorem 12.7, there is j with 1 < j < r such 
that p;|q;. This is a contradiction. So, there cannot exist two different prime factorizations of n. oO 


Since prime factorizations are unique only up to the order in which the factors are written, there can 
be many ways to write a prime factorization. For example, 10 can be written as 2-5 or 5- 2. To make 
things as simple as possible we always agree to use the canonical representation (or canonical form). 
The word “canonical” is just a fancy name for “natural,” and the most natural way to write a prime 
factorization is in increasing order of primes. So, the canonical representation of 10 is $2 \cdot 5$.


As another example, the canonical representation of 18 is $2 \cdot 3 \cdot 3$. We can tidy this up a bit by
rewriting $3 \cdot 3$ as $3^2$. So, the canonical representation of 18 is $2 \cdot 3^2$.


If you are new to factoring, you may find it helpful to draw a factor tree. 


For example, here is a factor tree for 18:

[Figure: factor tree for 18, with 18 split into 2 and 9, and 9 split into 3 and 3; the primes 2, 3, and 3 are boxed.]

To draw this tree, we started by writing 18 as the product $2 \cdot 9$. We put a box around 2 because 2 is
prime and does not need to be factored any more. We then proceeded to factor 9 as $3 \cdot 3$. We put a
box around each 3 because 3 is prime. We now see that we are done, and the prime factorization can
be found by multiplying all the boxed numbers together. Remember that we will usually want the
canonical representation, and so, we write the final product in increasing order of primes.
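
The same bookkeeping can be sketched in Python. This is only an illustration under our own assumptions: the function name `canonical_factorization` is hypothetical, and it uses repeated trial division rather than a hand-drawn factor tree. Dividing out each prime completely before moving on produces the factors in increasing order, which is exactly the canonical form.

```python
def canonical_factorization(n: int) -> list[tuple[int, int]]:
    """Return the prime factorization of n > 1 as (prime, exponent) pairs,
    listed in increasing order of primes (the canonical form)."""
    if n < 2:
        raise ValueError("n must be an integer greater than 1")
    factors = []
    p = 2
    while p * p <= n:
        exponent = 0
        while n % p == 0:      # divide out p as many times as possible
            n //= p
            exponent += 1
        if exponent > 0:
            factors.append((p, exponent))
        p += 1
    if n > 1:                  # whatever remains has no smaller factor, so it is prime
        factors.append((n, 1))
    return factors


print(canonical_factorization(18))   # [(2, 1), (3, 2)], that is, 2 * 3^2
```

Because every prime is divided out fully before `p` increases, a composite value of `p` can never divide what remains of `n`, so only primes end up in the list.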


By the Fundamental Theorem of Arithmetic above it does not matter how we factor the number—we 
will always get the same canonical form. For example, here is a different factor tree for 18: 


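[Figure: a second factor tree for 18, with 18 split into 3 and 6, and 6 split into 2 and 3; the primes 3, 2, and 3 are boxed.]
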

Now, to prove that a positive integer n is composite, we simply need to produce a factor of n that is 
different from 1 and n itself. This may sound easy, but in practice, as we look at larger and larger values 
of n it can become very difficult to find factors of n. For example, the largest prime number that we 
are currently aware of (at the time I am writing this book) is $2^{77{,}232{,}917} - 1$. This is an enormous number
with 23,249,425 digits. By Theorem 12.3, we know that there are prime numbers larger than this, but 
we have not yet found one. 
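
The digit count quoted above can be checked without ever writing the number out. The short Python sketch below is our own illustration: it uses the fact that a positive integer $m$ has $\lfloor \log_{10} m \rfloor + 1$ digits, and that $2^{77{,}232{,}917} - 1$ has the same number of digits as $2^{77{,}232{,}917}$ because a power of 2 is never a power of 10.

```python
import math

exponent = 77_232_917

# Number of decimal digits of 2**exponent, computed from logarithms
# instead of building the (enormous) integer itself.
digits = math.floor(exponent * math.log10(2)) + 1

print(f"{digits:,}")   # 23,249,425
```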


The following theorem provides a couple of tricks to help us (or a computer) determine if a positive 
integer is prime more quickly. 


Theorem 12.9: If $n$ is composite, then $n$ has a prime factor $p \le \sqrt{n}$.


Proof: Let $n$ be composite, so that there are integers $a, b$ with $1 < a, b < n$ and $n = ab$. If both $a$ and
$b$ are greater than $\sqrt{n}$, then we would have $n = ab > \sqrt{n} \cdot \sqrt{n} = n$, a contradiction.
So, either $a \le \sqrt{n}$ or $b \le \sqrt{n}$. Without loss of generality, suppose that $a \le \sqrt{n}$.
By Corollary 12.2, $a$ has a prime factor $p$. Since $p$ is a factor of $a$ and $a$ is a factor of $n$, it
follows that $p$ is a factor of $n$. Also, since $p$ is a factor of $a$ and $a \le \sqrt{n}$, we have
$p \le \sqrt{n}$. □


Example 12.9: 


1. Let’s determine if 187 is prime or composite. Since $\sqrt{187} < \sqrt{196} = 14$, by Theorem 12.9, we
need only check to see if 187 is divisible by 2, 3, 5, 7, 11, and 13. Checking each of these, we see
that $187 = 11 \cdot 17$. So, 187 is composite.


2. Let’s determine if 359 is prime or composite. Since $\sqrt{359} < \sqrt{361} = 19$, by Theorem 12.9, we
need only check to see if 359 is divisible by 2, 3, 5, 7, 11, 13, and 17. A quick check shows that
359 is not divisible by any of these numbers, and so, 359 is prime.
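
The check used in Example 12.9 is easy to hand to a computer. The following Python sketch is a minimal illustration, with hypothetical function names of our own choosing. For simplicity it tries every integer from 2 up to $\lfloor \sqrt{n} \rfloor$ rather than only the primes; this does some unnecessary work, but the first divisor it finds is still the smallest prime factor.

```python
import math


def smallest_prime_factor(n: int) -> int:
    """Return the smallest prime factor of n >= 2.

    By Theorem 12.9 it is enough to test divisors up to sqrt(n):
    if none of them divides n, then n itself is prime.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return d       # the smallest divisor greater than 1 is always prime
    return n


def is_prime(n: int) -> bool:
    return n >= 2 and smallest_prime_factor(n) == n


print(smallest_prime_factor(187), is_prime(187))   # 11 False
print(smallest_prime_factor(359), is_prime(359))   # 359 True
```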


Sometimes in a prime factorization we will want to make sure that we do not “skip” any primes, and
that each prime has a power.

For example, the canonical representation of 50 is $2 \cdot 5^2$. Note that we “skipped over” the prime 3 and
there is no exponent written for 2. We can easily give 2 an exponent by rewriting it as $2^1$, and since
$x^0 = 1$ for any nonzero $x$ (by definition), we can write $1 = 3^0$. Therefore, the prime factorization of
50 can be written as $2^1 \cdot 3^0 \cdot 5^2$.


This convention can be especially useful when comparing two or more positive integers or performing
an operation on two or more integers. We will say that $p_0^{a_0} p_1^{a_1} \cdots p_n^{a_n}$ is a complete
prime factorization if $p_0, p_1, \dots, p_n$ are the first $n + 1$ primes ($p_0 = 2$, $p_1 = 3$, and so on)
and $a_0, a_1, \dots, a_n \in \mathbb{N}$.


Example 12.10: 


1. The prime factorization of 364 in canonical form is $2^2 \cdot 7 \cdot 13$. However, this is not a complete
factorization.


A complete factorization of 364 is $2^2 \cdot 3^0 \cdot 5^0 \cdot 7^1 \cdot 11^0 \cdot 13^1$. This is not the only
complete factorization of 364. Another one is $2^2 \cdot 3^0 \cdot 5^0 \cdot 7^1 \cdot 11^0 \cdot 13^1 \cdot 17^0$.

Given a complete factorization $p_0^{a_0} p_1^{a_1} \cdots p_n^{a_n}$ of a positive integer,
$p_0^{a_0} p_1^{a_1} \cdots p_n^{a_n} p_{n+1}^0$ is another complete factorization, and in fact, for any
$k \in \mathbb{N}$, $p_0^{a_0} p_1^{a_1} \cdots p_n^{a_n} p_{n+1}^0 p_{n+2}^0 \cdots p_{n+k}^0$ is also a
complete factorization of that same positive integer. In words, we can include finitely many
additional prime factors at the tail end of the original factorization all with exponent 0. Just be
careful not to skip any primes!


2. $2^0 \cdot 3^5 \cdot 5^0 \cdot 7^2 \cdot 11^0 \cdot 13^0 \cdot 17^2$ and $2^3 \cdot 3^1 \cdot 5^0 \cdot 7^0 \cdot 11^0$
are complete prime factorizations. In many cases, it is useful to rewrite the second factorization as
$2^3 \cdot 3^1 \cdot 5^0 \cdot 7^0 \cdot 11^0 \cdot 13^0 \cdot 17^0$. This is also a complete prime factorization.
However, this one has all the same prime factors as the first number given.


Complete prime factorizations give us an easy way to compute greatest common divisors and least 
common multiples of positive integers. 

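Suppose that $a = p_0^{a_0} p_1^{a_1} \cdots p_n^{a_n}$ and $b = p_0^{b_0} p_1^{b_1} \cdots p_n^{b_n}$ are
complete prime factorizations of $a$ and $b$. Then we have

$$\gcd(a, b) = p_0^{\min\{a_0, b_0\}} p_1^{\min\{a_1, b_1\}} \cdots p_n^{\min\{a_n, b_n\}}$$

$$\operatorname{lcm}(a, b) = p_0^{\max\{a_0, b_0\}} p_1^{\max\{a_1, b_1\}} \cdots p_n^{\max\{a_n, b_n\}}$$
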
Example 12.11: Let $a = 2 \cdot 5^2 \cdot 7$ and $b = 3 \cdot 5 \cdot 11^2$. We can rewrite $a$ and $b$ with the
following complete prime factorizations: $a = 2^1 \cdot 3^0 \cdot 5^2 \cdot 7^1 \cdot 11^0$ and
$b = 2^0 \cdot 3^1 \cdot 5^1 \cdot 7^0 \cdot 11^2$. From these factorizations, it is easy to compute
$\gcd(a, b)$ and $\operatorname{lcm}(a, b)$.

$\gcd(a, b) = 2^0 \cdot 3^0 \cdot 5^1 \cdot 7^0 \cdot 11^0 = 5$ and
$\operatorname{lcm}(a, b) = 2^1 \cdot 3^1 \cdot 5^2 \cdot 7^1 \cdot 11^2 = 127{,}050$.

Observe that in this example, $ab = 350 \cdot 1815 = 635{,}250 = 5 \cdot 127{,}050 = \gcd(a, b) \cdot \operatorname{lcm}(a, b)$.
We will now show that the equation $ab = \gcd(a, b) \cdot \operatorname{lcm}(a, b)$ is true for all positive integers $a$ and $b$.
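
To see the exponent-by-exponent recipe in action, here is a small Python sketch for the numbers 350 and 1815. It is an illustration only: the names `PRIMES`, `exponents`, and `from_exponents` are hypothetical, and the list of primes is fixed at the first five, which happens to be enough for this example.

```python
from math import prod

PRIMES = [2, 3, 5, 7, 11]   # the first five primes (an assumption that suits this example)


def exponents(n, primes=PRIMES):
    """Exponents of a complete prime factorization of n over the given primes."""
    result = []
    for p in primes:
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        result.append(e)
    if n != 1:
        raise ValueError("n has a prime factor outside the given list")
    return result


def from_exponents(exps, primes=PRIMES):
    """Multiply the prime powers back together."""
    return prod(p ** e for p, e in zip(primes, exps))


a, b = 350, 1815
ea, eb = exponents(a), exponents(b)        # [1, 0, 2, 1, 0] and [0, 1, 1, 0, 2]
g = from_exponents([min(x, y) for x, y in zip(ea, eb)])
l = from_exponents([max(x, y) for x, y in zip(ea, eb)])

print(g, l, g * l == a * b)                # 5 127050 True
```

Padding both factorizations with exponent 0 is what lets `zip` pair the exponents prime by prime.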


Before we state and prove the theorem, note that min{x, y} + max{x, y} = x + y (check this!). 


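Theorem 12.10: Let $a, b \in \mathbb{Z}^+$. Then $\gcd(a, b) \cdot \operatorname{lcm}(a, b) = ab$.

Proof: Let $a = p_0^{a_0} p_1^{a_1} \cdots p_n^{a_n}$ and $b = p_0^{b_0} p_1^{b_1} \cdots p_n^{b_n}$ be complete
prime factorizations of $a$ and $b$. Then

$$\begin{aligned}
\gcd(a, b) \cdot \operatorname{lcm}(a, b)
&= p_0^{\min\{a_0, b_0\}} p_1^{\min\{a_1, b_1\}} \cdots p_n^{\min\{a_n, b_n\}} \cdot p_0^{\max\{a_0, b_0\}} p_1^{\max\{a_1, b_1\}} \cdots p_n^{\max\{a_n, b_n\}} \\
&= p_0^{\min\{a_0, b_0\}} p_0^{\max\{a_0, b_0\}} p_1^{\min\{a_1, b_1\}} p_1^{\max\{a_1, b_1\}} \cdots p_n^{\min\{a_n, b_n\}} p_n^{\max\{a_n, b_n\}} \\
&= p_0^{\min\{a_0, b_0\} + \max\{a_0, b_0\}} p_1^{\min\{a_1, b_1\} + \max\{a_1, b_1\}} \cdots p_n^{\min\{a_n, b_n\} + \max\{a_n, b_n\}} \\
&= p_0^{a_0 + b_0} p_1^{a_1 + b_1} \cdots p_n^{a_n + b_n} \\
&= p_0^{a_0} p_0^{b_0} p_1^{a_1} p_1^{b_1} \cdots p_n^{a_n} p_n^{b_n} \\
&= p_0^{a_0} p_1^{a_1} \cdots p_n^{a_n} \cdot p_0^{b_0} p_1^{b_1} \cdots p_n^{b_n} \\
&= ab
\end{aligned}$$
□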


We will finish this lesson with the Euclidean Algorithm. This is an algorithm for computing the gcd of 
two positive integers. It also provides a method for expressing the gcd as a linear combination of the 
two integers. 


Theorem 12.11 (The Euclidean Algorithm): Let $a, b \in \mathbb{Z}^+$ with $a \ge b$. Let $r_0 = a$, $r_1 = b$.
Apply the division algorithm to $r_0$ and $r_1$ to find $k_1, r_2 \in \mathbb{Z}^+$ such that
$r_0 = r_1 k_1 + r_2$, where $0 \le r_2 < r_1$. Suppose we iterate this process to get
$r_j = r_{j+1} k_{j+1} + r_{j+2}$, where $0 \le r_{j+2} < r_{j+1}$, for $j = 0, 1, \dots, n - 1$, so that
$r_{n+1} = 0$. Then $\gcd(a, b) = r_n$.


You will be asked to prove the Euclidean Algorithm in Problem 12 below. 


Example 12.12: Let’s use the Euclidean Algorithm to find gcd(305, 1040). 
$1040 = 305 \cdot 3 + 125$
$305 = 125 \cdot 2 + 55$
$125 = 55 \cdot 2 + 15$
$55 = 15 \cdot 3 + 10$
$15 = 10 \cdot 1 + 5$
$10 = 5 \cdot 2 + 0$


So, gcd(305, 1040) = 5. 
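
The repeated divisions above translate directly into a loop. The Python sketch below is a minimal illustration (the function name `euclid_steps` is our own); it records each division so that the output can be compared line by line with the six equations in this example. It assumes both inputs are positive integers.

```python
def euclid_steps(a: int, b: int):
    """Run the Euclidean Algorithm on positive integers a and b.

    Returns (gcd, steps), where each step (r0, r1, k, r2) records
    one application of the division algorithm: r0 = r1 * k + r2.
    """
    if a < b:
        a, b = b, a                 # make sure we start with the larger number
    steps = []
    while b != 0:
        k, r = divmod(a, b)         # a = b * k + r with 0 <= r < b
        steps.append((a, b, k, r))
        a, b = b, r
    return a, steps


g, steps = euclid_steps(305, 1040)
for r0, r1, k, r2 in steps:
    print(f"{r0} = {r1} * {k} + {r2}")
print("gcd =", g)
```

The last nonzero remainder is returned as the gcd, which matches the statement of Theorem 12.11.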


Notes: (1) In this example, we have $a = r_0 = 1040$ and $b = r_1 = 305$. By the Division Algorithm we
can write $1040 = 305 k_1 + r_2$, where $0 \le r_2 < 305$. To find $k_1$, we are simply looking for the largest
integer $k$ such that $305k \le 1040$. Well, $305 \cdot 3 = 915$ and $305 \cdot 4 = 1220$. So, 4 is too big and
therefore, we let $k_1 = 3$. It follows that $r_2 = 1040 - 305 \cdot 3 = 1040 - 915 = 125$.


We now repeat the procedure using $r_1 = 305$ and $r_2 = 125$ to get $305 = 125 \cdot 2 + 55$. Notice that
$125 \cdot 3 = 375$, which is too big because $375 > 305$. This is why we let $k_2 = 2$. It follows that
$r_3 = 305 - 125 \cdot 2 = 305 - 250 = 55$.

Continuing this process, we eventually wind up with $10 = 5 \cdot 2 + 0$, so that $r_7 = 0$. By Theorem 12.11,
$\gcd(305, 1040) = r_6 = 5$.


(2) As we go through the algorithm, we get $r_0 = 1040$, $r_1 = 305$, $r_2 = 125$, $r_3 = 55$, $r_4 = 15$,
$r_5 = 10$, $r_6 = 5$, and $r_7 = 0$.


We also get $k_1 = 3$, $k_2 = 2$, $k_3 = 2$, $k_4 = 3$, $k_5 = 1$, and $k_6 = 2$.


(3) We can now go backwards through the algorithm to express $\gcd(305, 1040)$ as a linear
combination of 305 and 1040.


We start with the second to last line (line 5): $15 = 10 \cdot 1 + 5$. We solve this equation for 5 to get
$5 = 15 - 1 \cdot 10$.


Working backwards, we next look at line 4: $55 = 15 \cdot 3 + 10$. We solve this equation for 10 and then
substitute into the previous equation: $10 = 55 - 15 \cdot 3$. After substituting, we get

$$5 = 15 - 1 \cdot 10 = 15 - 1(55 - 15 \cdot 3)$$

We then distribute and group all the 15’s together and all the 55’s together. So, we have

$$5 = 15 - 1 \cdot 10 = 15 - 1(55 - 15 \cdot 3) = 15 - 1 \cdot 55 + 3 \cdot 15 = 4 \cdot 15 - 1 \cdot 55.$$

Line 3 is next: $125 = 55 \cdot 2 + 15$. We solve this equation for 15 to get $15 = 125 - 2 \cdot 55$. And once
again we now substitute into the previous equation to get

$$5 = 4 \cdot 15 - 1 \cdot 55 = 4(125 - 2 \cdot 55) - 1 \cdot 55 = 4 \cdot 125 - 8 \cdot 55 - 1 \cdot 55 = 4 \cdot 125 - 9 \cdot 55.$$

Let’s go to line 2: $305 = 125 \cdot 2 + 55$. We solve this equation for 55 to get $55 = 305 - 2 \cdot 125$.
Substituting into the previous equation gives us

$$5 = 4 \cdot 125 - 9 \cdot 55 = 4 \cdot 125 - 9(305 - 2 \cdot 125) = 4 \cdot 125 - 9 \cdot 305 + 18 \cdot 125 = 22 \cdot 125 - 9 \cdot 305.$$

And finally line 1: $1040 = 305 \cdot 3 + 125$. Solving this equation for 125 gives us $125 = 1040 - 3 \cdot 305$.
Substituting into the previous equation gives

$$5 = 22 \cdot 125 - 9 \cdot 305 = 22(1040 - 3 \cdot 305) - 9 \cdot 305 = 22 \cdot 1040 - 66 \cdot 305 - 9 \cdot 305 = 22 \cdot 1040 - 75 \cdot 305.$$

So, we see that $\gcd(305, 1040) = 5 = 22 \cdot 1040 - 75 \cdot 305 = -75 \cdot 305 + 22 \cdot 1040$.
(4) With a little practice, the computations done in Note 3 can be done fairly quickly. Here is what the
quicker computation might look like:

$$\begin{aligned}
5 &= 15 - 1 \cdot 10 = 15 - 1 \cdot (55 - 15 \cdot 3) = 4 \cdot 15 - 1 \cdot 55 = 4(125 - 55 \cdot 2) - 1 \cdot 55 = 4 \cdot 125 - 9 \cdot 55 \\
&= 4 \cdot 125 - 9(305 - 125 \cdot 2) = 22 \cdot 125 - 9 \cdot 305 = 22(1040 - 305 \cdot 3) - 9 \cdot 305 = 22 \cdot 1040 - 75 \cdot 305
\end{aligned}$$


So, $5 = \gcd(305, 1040) = -75 \cdot 305 + 22 \cdot 1040$.
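
A computer usually does not work backwards through the equations. Instead, it can carry the coefficients forward alongside the remainders. The Python sketch below shows this forward bookkeeping, which is a different arrangement of the same arithmetic as the back-substitution in Notes 3 and 4, not a transcription of it. The function name `bezout` is our own, and the inputs are assumed to be positive integers.

```python
def bezout(a: int, b: int):
    """Return (g, x, y) with g = gcd(a, b) = a*x + b*y.

    Throughout the loop the following two equations stay true:
        old_r = a*old_x + b*old_y    and    r = a*x + b*y
    """
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        k = old_r // r                       # the quotient from the division algorithm
        old_r, r = r, old_r - k * r          # the next remainder
        old_x, x = x, old_x - k * x          # update the coefficient of a
        old_y, y = y, old_y - k * y          # update the coefficient of b
    return old_r, old_x, old_y


g, x, y = bezout(305, 1040)
print(g, x, y)                   # 5 -75 22
print(305 * x + 1040 * y)        # 5
```

Keep in mind that such coefficients are not unique; this procedure simply happens to produce the same pair, $-75$ and $22$, that we found by hand.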
Problem Set 12


Full solutions to these problems are available for free download here: 


www.SATPrepGet800.com/PMFBXSG 
LEVEL 1 


1. Write each of the following positive integers as a product of prime factors in canonical form: 
(i) 9 
(ii) 13 
(iii) 21 
(iv) 30 
(v) 44 
(vi) 693 
(vii) 67,500 
(viii) 384,659
(ix) 9,699,690 


2. List all prime numbers less than 100. 


3. Find the gcd and lcm of each of the following sets of numbers: 
(i) {4,6} 
(ii) {12,180} 
(iii) {2,3,5} 
(iv) {14,21,77} 
(v) {720, 2448, 5400} 
(vi) {2175411°23, 25327411313} 


LEVEL 2 


4. Determine if each of the following numbers is prime: 

(i) 101
(ii) 399
(iii) 1829
(iv) 1933
(v) 8051
(vi) 13,873
(vii) 65,623


5. Use the division algorithm to find the quotient and remainder when 723 is divided by 17.


6. For $n \in \mathbb{Z}^+$, let $M_n = n! + 1$. Determine if $M_n$ is prime for $n = 1, 2, 3, 4, 5, 6$, and 7.


LEVEL 3 


7. Use the Euclidean Algorithm to find gcd(825, 2205). Then express gcd(825, 2205) as a linear 
combination of 825 and 2205. 


8. Prove that if $k \in \mathbb{Z}$ with $k > 1$, then $k^3 + 1$ is not prime.
9. Prove that $\gcd(a, b) \mid \operatorname{lcm}(a, b)$.
10. Let $a, b, c \in \mathbb{Z}$. Prove that $\gcd(a, b) = \gcd(a + bc, b)$.


11. Let $a, b, k, r \in \mathbb{Z}$ with $a = bk + r$. Prove that $\gcd(a, b) = \gcd(r, b)$.


LEVEL 4 


12. Prove the Euclidean Algorithm: Let $a, b \in \mathbb{Z}^+$ with $a \ge b$. Let $r_0 = a$, $r_1 = b$. Apply the
division algorithm to $r_0$ and $r_1$ to find $k_1, r_2 \in \mathbb{Z}^+$ such that $r_0 = r_1 k_1 + r_2$, where
$0 \le r_2 < r_1$. Suppose we iterate this process to get $r_j = r_{j+1} k_{j+1} + r_{j+2}$, where
$0 \le r_{j+2} < r_{j+1}$, for $j = 0, 1, \dots, n - 1$, so that $r_{n+1} = 0$. Then $\gcd(a, b) = r_n$.


13. Prove that if a|c and b|c, then lcm(a, b) | c. 


14. Suppose that $a, b \in \mathbb{Z}^+$, $\gcd(a, b) = 1$, and $c \mid ab$. Prove that there are integers $d$ and
$e$ such that $c = de$, $d \mid a$, and $e \mid b$.


15. A prime triple is a sequence of three prime numbers of the form p, p + 2, and p + 4. For 
example, 3,5,7 is a prime triple. Prove that there are no other prime triples. 


LEVEL 5 


16. If $a, b \in \mathbb{Z}^+$ and $\gcd(a, b) = 1$, find the following:
(i)  gcd(a,a+1) 
(ii) gcd(a,a+2) 
(iii) gcd(3a + 2,5a + 3) 
(iv) gcd(a+b,a—b) 
(v) gcd(a + 2b,2a +b) 


17. Find the smallest ideal of $\mathbb{Z}$ containing 6 and 15. Find the smallest ideal of $\mathbb{Z}$ containing
2 and 3. In general, find the smallest ideal of $\mathbb{Z}$ containing $j$ and $k$, where $j, k \in \mathbb{Z}$.


18. Find all subgroups of $(\mathbb{Z}, +)$ and all submonoids of $(\mathbb{Z}, +)$.


LESSON 13 - REAL ANALYSIS
LIMITS AND CONTINUITY


Strips and Rectangles 
A horizontal strip in $\mathbb{R} \times \mathbb{R}$ is a set of the form $\mathbb{R} \times (c, d) = \{(x, y) \mid c < y < d\}$.


Example 13.1: The horizontal strips $\mathbb{R} \times (-2, 1)$ and $\mathbb{R} \times (2.25, 2.75)$ can be
visualized in the xy-plane as follows:

[Figure: two copies of the xy-plane, one showing the horizontal strip $\mathbb{R} \times (-2, 1)$ and the other showing the horizontal strip $\mathbb{R} \times (2.25, 2.75)$.]


Similarly, a vertical strip in $\mathbb{R} \times \mathbb{R}$ is a set of the form $(a, b) \times \mathbb{R} = \{(x, y) \mid a < x < b\}$.


Example 13.2: The vertical strips $(-3, 0) \times \mathbb{R}$ and $(0.8, 1) \times \mathbb{R}$ can be visualized in
the xy-plane as follows:

[Figure: two copies of the xy-plane, one showing the vertical strip $(-3, 0) \times \mathbb{R}$ and the other showing the vertical strip $(0.8, 1) \times \mathbb{R}$.]

We will say that the horizontal strip $\mathbb{R} \times (c, d)$ contains $y$ if $y \in \mathbb{R}$ and $c < y < d$.
Otherwise, we will say that the horizontal strip excludes $y$.


Similarly, we will say that the vertical strip $(a, b) \times \mathbb{R}$ contains $x$ if $x \in \mathbb{R}$ and
$a < x < b$. Otherwise, we will say that the vertical strip excludes $x$.


Example 13.3: The horizontal strip $\mathbb{R} \times (2.25, 2.75)$ contains 2.5 and excludes 3. One way to visualize
this is to draw the horizontal lines y = 2.5 and y = 3. Below in the figure on the left, we used a solid 
line for the line y = 2.5 because it is contained in the horizontal strip and we used a dashed line for 
the line y = 3 because it is not contained in the horizontal strip. 


Similarly, the vertical strip $(0.8, 1) \times \mathbb{R}$ contains 0.9 and excludes 2. Again, we can visualize this by
drawing the vertical lines x = 0.9 and x = 2. These vertical lines are shown below in the figure on the 
right. 


An open rectangle is a set of the form $(a, b) \times (c, d) = \{(x, y) \mid a < x < b \wedge c < y < d\}$. Note that
the open rectangle $(a, b) \times (c, d)$ is the intersection of the horizontal strip $\mathbb{R} \times (c, d)$ and
the vertical strip $(a, b) \times \mathbb{R}$. We will say that an open rectangle traps the point $(x, y)$ if
$x, y \in \mathbb{R}$ and $(x, y)$ is in the open rectangle. Otherwise, we will say that $(x, y)$ escapes from the
open rectangle.


Example 13.4: The open rectangle $R = (-3, 0) \times (-2, 1)$ is the intersection of the horizontal strip
$H = \mathbb{R} \times (-2, 1)$ and the vertical strip $V = (-3, 0) \times \mathbb{R}$. So, $R = H \cap V$. The
rectangle $R$ traps $(-1, 0)$, whereas $(-2, 3)$ escapes from $R$. This can be seen in the figure below on the left.


The open rectangle $R = (0.8, 1) \times (2.25, 2.75)$ is the intersection of the horizontal strip
$H = \mathbb{R} \times (2.25, 2.75)$ and the vertical strip $V = (0.8, 1) \times \mathbb{R}$. So, $R = H \cap V$. The
rectangle $R$ traps $(0.9, 2.5)$, whereas $(0.9, 2)$ escapes from $R$. This can be seen in the figure below on the right.


Observe that in this example, I chose points that escape the given rectangles in the vertical direction.
They fall outside the rectangle because they’re too high or too low. This is the only type of escape that
we will be interested in here. We do not care about points that escape to the left or right of a rectangle.

Let $A \subseteq \mathbb{R}$, let $f: A \to \mathbb{R}$, and let $R = (a, b) \times (c, d)$ be an open rectangle. We say
that $R$ traps $f$ if for all $x \in (a, b)$, $R$ traps $(x, f(x))$. Otherwise we say that $f$ escapes from $R$.


Example 13.5: Let $f: \mathbb{R} \to \mathbb{R}$ be defined by $f(x) = x + 1$. Consider the open rectangles
$R = (0, 2) \times (1, 3)$ and $S = (0, 2) \times (0, 2)$. Then $R$ traps $f$, as can be seen in the figure below on the
left, whereas $f$ escapes from $S$, as can be seen in the figure below on the right. I put a box around the
points of the form $(x, f(x))$ that escape from $S$. For example, the point $(1.2, f(1.2)) = (1.2, 2.2)$
escapes from $S$ because $0 < 1.2 < 2$, but $f(1.2) = 2.2 \ge 2$.
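
For readers who like to experiment, the two rectangles in Example 13.5 can be probed numerically. The Python sketch below is our own illustration with a hypothetical function name, `find_escape`. It only tests finitely many sample points, so finding no escaping point is evidence that a rectangle traps $f$, not a proof; finding an escaping point, on the other hand, does show that $f$ escapes.

```python
def find_escape(f, a, b, c, d, samples=1000):
    """Look for a sample point x in (a, b) such that (x, f(x)) escapes
    the open rectangle (a, b) x (c, d).

    Returns the first such x, or None if no sampled point escapes.
    """
    for i in range(1, samples):
        x = a + (b - a) * i / samples      # a point strictly between a and b
        if not (c < f(x) < d):
            return x
    return None


def f(x):
    return x + 1


print(find_escape(f, 0, 2, 1, 3))   # None: no sampled point escapes from R
print(find_escape(f, 0, 2, 0, 2))   # a sample point x whose value x + 1 is at least 2
```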


When we are checking the limiting behavior near a real number r, we don’t care if the point (r, f(r)) 
escapes. Therefore, before we define a limit, we need to modify our definitions of “traps” and 
“escapes” slightly to account for this. 


Let $A \subseteq \mathbb{R}$, let $f: A \to \mathbb{R}$, and let $R = (a, b) \times (c, d)$ be an open rectangle. We say
that $R$ traps $f$ around $r$ if for all $x \in (a, b) \setminus \{r\}$, $R$ traps $(x, f(x))$. Otherwise, we say $f$
escapes from $R$ around $r$.

Limits and Continuity


Let $A \subseteq \mathbb{R}$, let $f: A \to \mathbb{R}$, and let $r, L \in \mathbb{R}$. We say that the limit of $f$ as $x$
approaches $r$ is $L$, written $\lim_{x \to r} f(x) = L$, if for every horizontal strip $H$ that contains $L$ there is a
vertical strip $V$ that contains $r$ such that the rectangle $H \cap V$ traps $f$ around $r$.


Technical note: According to the definition of limit just given, in order for $\lim_{x \to r} f(x)$ to exist, the
set $A$ needs to contain a deleted neighborhood of $r$, say $N_\epsilon^{\odot}(r) = (r - \epsilon, r) \cup (r, r + \epsilon)$.
As an example, suppose that $A = \{0\}$ and $f: A \to \mathbb{R}$ is defined by $f(0) = 1$. What is the value of
$\lim_{x \to r} f(x)$? Well, any rectangle of the form $H \cap V$ does not trap any points of the form $(x, f(x))$
with $x \ne 0$ simply because $f(x)$ is not defined when $x \ne 0$. Therefore, given a horizontal strip $H$, there is
no vertical strip $V$ such that $H \cap V$ traps $f$ around $r$, and so, $\lim_{x \to r} f(x)$ does not exist. This
agrees with our intuition.


As a less extreme example, suppose that $A = \mathbb{Q}$ and $g: \mathbb{Q} \to \mathbb{R}$ is the constant function
where $g(x) = 1$ for all $x \in \mathbb{Q}$. Then for any $r \in \mathbb{Q}$, we should probably have
$\lim_{x \to r} g(x) = 1$. But if we use our current definition of limit, then $\lim_{x \to r} g(x)$ does not exist. A
more general definition of limit would yield finite values for limits defined on certain sets (like $\mathbb{Q}$)
that do not contain a neighborhood of $r$.


Specifically, we really should insist only that for each $\epsilon \in \mathbb{R}^+$,
$A \cap ((r - \epsilon, r + \epsilon) \setminus \{r\}) \ne \emptyset$. The definition of limit given above could be
modified slightly to accommodate this more general situation. For example, we could change “$R$ traps $f$
around $r$” to “for all $x \in A \cap ((a, b) \setminus \{r\})$, $R$ traps $(x, f(x))$.” If we were to use this more
general definition, it is very important that we also insist that the set $A$ has the property given at the
beginning of this paragraph. Otherwise, we would have an issue with the function $f$ defined at the
beginning of this note. The interested reader may want to investigate this.


In this lesson, we will avoid these more complicated domains and stick with the simpler definition of
limit. Let’s just always assume that if $\lim_{x \to r} f(x)$ exists, then $f$ is defined on some deleted
neighborhood of $r$.


Example 13.6: Let $f: \mathbb{R} \to \mathbb{R}$ be defined by $f(x) = x + 1$, let $r = 1.5$, and let $L = 1$. Let’s
show that $\lim_{x \to 1.5} f(x) \ne 1$.


If $H = \mathbb{R} \times (0, 2)$ and $V$ is any vertical strip that contains 1.5, then $H \cap V$ does not trap $f$
around 1.5. Indeed, if $V = (a, b) \times \mathbb{R}$, then if we let $x = \frac{1}{2}(1.5 + b)$, we will show that
$x \in (a, b)$ and $f(x) = \frac{1}{2}(1.5 + b) + 1 > 2$ (see the figure to the right).

To see that $x \in (a, b)$, note that since $b > 1.5$, we have
$x = \frac{1}{2}(1.5 + b) > \frac{1}{2}(1.5 + 1.5) = \frac{1}{2} \cdot 3 = 1.5 > a$, and we have
$x = \frac{1}{2}(1.5 + b) < \frac{1}{2}(b + b) = \frac{1}{2} \cdot 2b = b$.

To see that $f(x) > 2$, note that

$$f(x) = \tfrac{1}{2}(1.5 + b) + 1 > \tfrac{1}{2}(1.5 + 1.5) + 1 = \tfrac{1}{2} \cdot 3 + 1 = 1.5 + 1 = 2.5 > 2.$$


So, what is $\lim_{x \to 1.5} f(x)$ equal to? From the picture above, a good guess would be 2.5. To verify that
this is true, let $H = \mathbb{R} \times (c, d)$ be a horizontal strip that contains 2.5. Next, let
$V = (c - 1, d - 1) \times \mathbb{R}$. Since $c < 2.5 < d$, we have $c - 1 < 1.5 < d - 1$, and so, $V$ contains 1.5.
We will show that $H \cap V = (c - 1, d - 1) \times (c, d)$ traps $f$ around 1.5. Let
$x \in (c - 1, d - 1) \setminus \{1.5\}$, so that $c - 1 < x < d - 1$ and $x \ne 1.5$. Adding 1 to each part of this
sequence of inequalities gives $c < x + 1 < d$, so that $c < f(x) < d$, or equivalently, $f(x) \in (c, d)$. Since
$x \in (c - 1, d - 1) \setminus \{1.5\}$ and $f(x) \in (c, d)$, it follows that
$(x, f(x)) \in (c - 1, d - 1) \times (c, d) = H \cap V$. Therefore, $H \cap V$ traps $f$ around 1.5.


Notes: (1) The figures above give a visual representation of the argument just presented. In the figure
on the left, we let $c = 2$ and $d = 3$, so that $H = \mathbb{R} \times (2, 3)$. Our choice of $V$ is then
$(1, 2) \times \mathbb{R}$, and therefore, $H \cap V = (1, 2) \times (2, 3)$. Now, if $1 < x < 2$, then $2 < x + 1 < 3$.
So, $(x, f(x)) \in H \cap V$.


In the figure on the right, we started with a thinner horizontal strip without being specific about its
exact definition. Notice that we then need to use a thinner vertical strip to prevent $f$ from escaping. If
the vertical strip were just a little wider on the right, then some points of the form $(x, f(x))$ would
escape the rectangle because they would be too high. If the vertical strip were just a little wider on the
left, then some points of the form $(x, f(x))$ would escape the rectangle because they would be too
low.


(2) Notice that in this example, the point $(1.5, f(1.5))$ itself always stays in the rectangle. In the
argument given, we excluded this point from consideration. Even if $(1.5, f(1.5))$ were to escape the
rectangle, it would not change the result here. We would still have $\lim_{x \to 1.5} f(x) = 2.5$. I indicated the
parts of the argument where $(1.5, f(1.5))$ was being excluded from consideration in Example 13.6
above by placing rectangles around that part of the text. If we delete all the parts of the argument
inside those rectangles, the resulting argument would still be correct. We will examine this situation
more carefully in the next example.

If we modify the definition of limit by getting rid of “around $r$,” insisting that $r \in A$, and replacing $L$ by
$f(r)$, we get the definition of continuity. Specifically, we have the following definition.


Let $A \subseteq \mathbb{R}$, let $f: A \to \mathbb{R}$, and let $r \in A$. We say that the function $f$ is continuous at
$r$ if for every horizontal strip $H$ that contains $f(r)$ there is a vertical strip $V$ that contains $r$ such that
the rectangle $H \cap V$ traps $f$.


Example 13.7: 


1. If we delete all the text that I placed in rectangles in Example 13.6 above, then the resulting
argument shows that the function $f$ defined by $f(x) = x + 1$ is continuous at $x = 1.5$.


To summarize, given a horizontal strip $H$ containing $f(1.5) = 2.5$, we found a vertical strip $V$
containing 1.5 such that $H \cap V$ traps $f$. Notice once again that in this example we do not
exclude $x = 1.5$ from consideration, and when we mention trapping $f$, we do not say “around
1.5.” We need to trap $(1.5, f(1.5)) = (1.5, 2.5)$ as well.


2. Let’s consider the function $g: \mathbb{R} \to \mathbb{R}$ defined by
$g(x) = \begin{cases} x + 1 & \text{if } x \ne 1.5 \\ -2 & \text{if } x = 1.5 \end{cases}$. This function is
nearly identical to the function $f$ we have been discussing. It differs from the previous function
only at $x = 1.5$. It should follow that $\lim_{x \to 1.5} g(x) = \lim_{x \to 1.5} f(x)$. And, in fact it does. The same
exact argument that we gave in Example 13.6 shows that $\lim_{x \to 1.5} g(x) = 2.5$. The figures below
illustrate the situation.


This time however, we cannot delete the text inside the rectangles in Example 13.6. $x = 1.5$
needs to be excluded from consideration for the argument to go through. In the leftmost figure
below, we see that if $H$ is the horizontal strip $H = \mathbb{R} \times (2, 3)$, then for any vertical strip
$V = (a, b) \times \mathbb{R}$ that contains 1.5, the point $(1.5, -2)$ will escape the rectangle $H \cap V$. Indeed,
$H \cap V = (a, b) \times (2, 3)$, and $(1.5, g(1.5)) = (1.5, -2) \notin (a, b) \times (2, 3)$ because $-2 < 2$. This
shows that $g$ is not continuous at $x = 1.5$.


[Figure: two graphs of $y = g(x)$, where $g(x) = x + 1$ if $x \ne 1.5$ and $g(x) = -2$ if $x = 1.5$, each drawn together with a vertical strip $V$.]

The strip game: Suppose we want to determine if $\lim_{x \to r} f(x) = L$. Consider the following game between
two players: Player 1 “attacks” by choosing a horizontal strip $H_0$ containing $L$. Player 2 then tries to
“defend” by choosing a vertical strip $V_0$ containing $r$ such that $H_0 \cap V_0$ traps $f$ around $r$. If Player 2
cannot find such a vertical strip, then Player 1 wins and $\lim_{x \to r} f(x) \ne L$. If Player 2 defends successfully,
then Player 1 chooses a new horizontal strip $H_1$ containing $L$. If Player 1 is smart, then he/she will
choose a “much thinner” horizontal strip that is contained in $H_0$ (compare the two figures above). The
thinner the strip, the harder it will be for Player 2 to defend. Player 2 once again tries to choose a
vertical strip $V_1$ such that $H_1 \cap V_1$ traps $f$ around $r$. This process continues indefinitely. Player 1 wins
the strip game if at some stage, Player 2 cannot defend successfully. Player 2 wins the strip game if he
or she defends successfully at every stage.


Player 1 has a winning strategy for the strip game if and only if $\lim_{x \to r} f(x) \ne L$, while Player 2 has a
winning strategy for the strip game if and only if $\lim_{x \to r} f(x) = L$.


Note that if it’s possible for Player 1 to win the strip game, then Player 1 can win with a single move— 
just choose the horizontal strip immediately that Player 2 cannot defend against. 


For example, if $f(x) = x + 1$, then $\lim_{x \to 1.5} f(x) \ne 1$. Player 1 can win the appropriate strip game
immediately by choosing the horizontal strip $H = \mathbb{R} \times (0, 2)$. Indeed, if Player 2 chooses any
vertical strip $V = (a, b) \times \mathbb{R}$ that contains 1.5, let $x \in (a, b)$ with $x > 1.5$. Then we have

$$f(x) = x + 1 > 1.5 + 1 = 2.5 > 2.$$

So, $(x, f(x))$ escapes $H \cap V$. In the figure to the right, we see that Player 1 has chosen
$H = \mathbb{R} \times (0, 2)$ and Player 2 chose $V = (a, b) \times \mathbb{R}$ for some $a, b \in \mathbb{R}$ with
$a < 1.5 < b$. The part of the line inside the square is an illustration of where $f$ escapes $H \cap V$ between
$a$ and $b$. Observe that no matter how much thinner we try to make that vertical strip, if it contains 1.5,
then it will contain a portion of the line that is inside the square.


Now, if it’s possible for Player 2 to win the game, then we need to describe how Player 2 defends
against an arbitrary attack from Player 1. Suppose again that $f(x) = x + 1$ and we are trying to show
that $\lim_{x \to 1.5} f(x) = 2.5$. We have already seen how Player 2 can defend against an arbitrary attack from
Player 1 in Example 13.6. If at stage $n$, Player 1 attacks with the horizontal strip $H_n = \mathbb{R} \times (a, b)$,
then Player 2 can successfully defend with the vertical strip $V_n = (a - 1, b - 1) \times \mathbb{R}$.
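
A few rounds of the strip game for $f(x) = x + 1$ can be simulated in Python. The sketch below is only an illustration under our own assumptions: the function names are hypothetical, Player 1's attacks are random strips around 2.5 that get thinner each round, and each defense is spot-checked at finitely many sample points. A computer can therefore support the claim that Player 2's strategy works, but it cannot replace the proof in Example 13.6.

```python
import random


def f(x):
    return x + 1


def defend(c, d):
    # Player 2's reply from the text: answer the strip R x (c, d) with (c - 1, d - 1) x R.
    return c - 1, d - 1


def defense_holds(c, d, r=1.5, samples=200):
    """Spot-check that the rectangle (a, b) x (c, d) traps f around r."""
    a, b = defend(c, d)
    if not (a < r < b):
        return False                        # the vertical strip must contain r
    for i in range(1, samples):
        x = a + (b - a) * i / samples       # a sample point strictly inside (a, b)
        if x != r and not (c < f(x) < d):
            return False                    # (x, f(x)) escapes
    return True


def play(rounds=10, seed=0):
    """Player 1 attacks with thinner and thinner horizontal strips containing 2.5."""
    rng = random.Random(seed)
    width, results = 1.0, []
    for _ in range(rounds):
        c = 2.5 - width * rng.uniform(0.1, 1.0)
        d = 2.5 + width * rng.uniform(0.1, 1.0)
        results.append(defense_holds(c, d))
        width /= 2
    return results


print(all(play()))            # True: every sampled defense succeeds
print(defense_holds(0, 2))    # False: the strip R x (0, 2) does not contain 2.5
```

In the last line the reply $(-1, 1) \times \mathbb{R}$ fails to contain 1.5, so this particular strategy does not defend against a strip chosen around the wrong value.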


Equivalent Definitions of Limits and Continuity 


The definitions of limit and continuity can be written using open intervals instead of strips. Specifically,
we have the following:

Theorem 13.1: Let $A \subseteq \mathbb{R}$, let $f: A \to \mathbb{R}$, and let $r, L \in \mathbb{R}$. The following are equivalent:
1. $\lim_{x \to r} f(x) = L$.


2. For every open interval $(c, d)$ with $L \in (c, d)$, there is an open interval $(a, b)$ with $r \in (a, b)$
such that whenever $x \in (a, b)$ and $x \ne r$, $f(x) \in (c, d)$.


3. For every positive real number $\epsilon$, there is a positive real number $\delta$ such that whenever
$x \in (r - \delta, r + \delta)$ and $x \ne r$, $f(x) \in (L - \epsilon, L + \epsilon)$.


This is the first theorem where we want to prove more than two statements equivalent. We will do
this with the following chain: $1 \to 2 \to 3 \to 1$. In other words, we will assume statement 1 and use it to
prove statement 2. We will then assume statement 2 and use it to prove statement 3. Finally, we will 
assume statement 3 and use it to prove statement 1. 


Proof of Theorem 13.1: ($1 \to 2$) Suppose that $\lim_{x \to r} f(x) = L$ and let $L \in (c, d)$. Then the horizontal
strip $\mathbb{R} \times (c, d)$ contains $L$. Since $\lim_{x \to r} f(x) = L$, there is a vertical strip
$(a, b) \times \mathbb{R}$ that contains $r$ such that the rectangle $R = (a, b) \times (c, d)$ traps $f$ around $r$. Since the
vertical strip $(a, b) \times \mathbb{R}$ contains $r$, $r \in (a, b)$. Since the rectangle $R$ traps $f$ around $r$, for all
$x \in (a, b) \setminus \{r\}$, $R$ traps $(x, f(x))$. In other words, whenever $x \in (a, b)$ and $x \ne r$, we have
$(x, f(x)) \in (a, b) \times (c, d)$, and thus, $f(x) \in (c, d)$.


($2 \to 3$) Suppose 2 holds and let $\epsilon$ be a positive real number. Then $L - \epsilon < L < L + \epsilon$, or
equivalently, $L \in (L - \epsilon, L + \epsilon)$. By 2, there is an open interval $(a, b)$ with $r \in (a, b)$ such that
whenever $x \in (a, b)$ and $x \ne r$, we have $f(x) \in (L - \epsilon, L + \epsilon)$. Let $\delta = \min\{r - a, b - r\}$.
Since $a < r < b$, we have $\delta > 0$. Since $\delta \le r - a$, we have $-\delta \ge -(r - a) = -r + a$. Therefore,
$r - \delta \ge r + (-r + a) = a$. Furthermore, since $\delta \le b - r$, we have $r + \delta \le r + (b - r) = b$. So,
$(r - \delta, r + \delta) \subseteq (a, b)$. If $x \in (r - \delta, r + \delta)$ and $x \ne r$, then since
$(r - \delta, r + \delta) \subseteq (a, b)$, $x \in (a, b)$. Therefore, $f(x) \in (L - \epsilon, L + \epsilon)$.


($3 \to 1$) Suppose 3 holds and $H = \mathbb{R} \times (c, d)$ is a horizontal strip that contains $L$. Since
$c < L < d$, we have $L - c > 0$ and $d - L > 0$. Therefore, $\epsilon = \min\{L - c, d - L\} > 0$. So, there is
$\delta > 0$ such that whenever $x \in (r - \delta, r + \delta)$ and $x \ne r$, then $f(x) \in (L - \epsilon, L + \epsilon)$.
Let $V = (r - \delta, r + \delta) \times \mathbb{R}$. Then $V$ contains $r$. We now show that
$H \cap V = (r - \delta, r + \delta) \times (c, d)$ traps $f$ around $r$. Let $x \in (r - \delta, r + \delta)$ with $x \ne r$.
Then $f(x) \in (L - \epsilon, L + \epsilon)$. So, $f(x) > L - \epsilon \ge L - (L - c) = c$ and
$f(x) < L + \epsilon \le L + (d - L) = d$. Therefore, $f(x) \in (c, d)$, and so, $H \cap V$ traps $f$ around $r$. □


Notes: (1) $\epsilon$ and $\delta$ are Greek letters pronounced “epsilon” and “delta,” respectively. Mathematicians
tend to use these two symbols to represent arbitrarily small numbers.


(2) If $a \in \mathbb{R}$ and $\epsilon > 0$, then the $\epsilon$-neighborhood of $a$ is the interval
$N_\epsilon(a) = (a - \epsilon, a + \epsilon)$ and the deleted $\epsilon$-neighborhood of $a$ is the “punctured” interval
$N_\epsilon^{\odot}(a) = (a - \epsilon, a) \cup (a, a + \epsilon)$. We can visualize the deleted $\epsilon$-neighborhood
$N_\epsilon^{\odot}(a)$ as follows:

[Figure: number line showing the open interval from $a - \epsilon$ to $a + \epsilon$ with the point $a$ removed.]

For a specific example, let’s look at $N_2^{\odot}(1) = (1 - 2, 1) \cup (1, 1 + 2) = (-1, 1) \cup (1, 3)$.


(3) The third part of Theorem 13.1 can be written in terms of neighborhoods as follows:
“For every positive real number $\epsilon$, there is a positive real number $\delta$ such that whenever
$x \in N_\delta^{\odot}(r)$, $f(x) \in N_\epsilon(L)$.”


(4) $x \in (a - \epsilon, a + \epsilon)$ is equivalent to $a - \epsilon < x < a + \epsilon$. If we subtract $a$ from each part of
this inequality, we get $-\epsilon < x - a < \epsilon$. This last expression is equivalent to $|x - a| < \epsilon$. So, we have
the following sequence of equivalences:

$$x \in N_\epsilon(a) \Leftrightarrow x \in (a - \epsilon, a + \epsilon) \Leftrightarrow a - \epsilon < x < a + \epsilon \Leftrightarrow |x - a| < \epsilon.$$

(5) $x \ne a$ is equivalent to $x - a \ne 0$. Since the absolute value of a real number can never be negative,
$x - a \ne 0$ is equivalent to $|x - a| > 0$. This can also be written $0 < |x - a|$. So, we have the following
sequence of equivalences:

$$x \in N_\epsilon^{\odot}(a) \Leftrightarrow x \in (a - \epsilon, a) \cup (a, a + \epsilon) \Leftrightarrow 0 < |x - a| < \epsilon.$$


(6) The third part of Theorem 13.1 can be written using absolute values as follows:

“For every positive real number $\epsilon$, there is a positive real number $\delta$ such that whenever
$0 < |x - r| < \delta$, $|f(x) - L| < \epsilon$.”

(7) We can abbreviate the expression from Note 6 using quantifiers as follows:

$$\forall \epsilon > 0 \; \exists \delta > 0 \; (0 < |x - r| < \delta \to |f(x) - L| < \epsilon)$$

We will refer to this expression as the $\epsilon - \delta$ definition of a limit.


For each equivalent formulation of a limit, we have a corresponding formulation for the definition of 
continuity. 
Theorem 13.2: Let $A \subseteq \mathbb{R}$, let $f: A \to \mathbb{R}$, and let $r \in A$. The following are equivalent:

1. $f$ is continuous at $r$.


2. For every open interval $(c, d)$ with $f(r) \in (c, d)$, there is an open interval $(a, b)$ with $r \in (a, b)$
such that whenever $x \in (a, b)$, $f(x) \in (c, d)$.


3. For every positive real number $\epsilon$, there is a positive real number $\delta$ such that whenever
$x \in (r - \delta, r + \delta)$, $f(x) \in (f(r) - \epsilon, f(r) + \epsilon)$.

4. $\forall \epsilon > 0 \; \exists \delta > 0 \; (|x - r| < \delta \to |f(x) - f(r)| < \epsilon)$.


The proof of Theorem 13.2 is left to the reader. It is very similar to the proof of Theorem 13.1.

Basic Examples
Example 13.8: Let’s use the $\epsilon - \delta$ definition of a limit to prove that $\lim_{x \to 1} (2x + 1) = 3$.


Analysis: Given $\epsilon > 0$, we need to find $\delta > 0$ so that $0 < |x - 1| < \delta$ implies $|(2x + 1) - 3| < \epsilon$.
First note that $|(2x + 1) - 3| = |2x - 2| = |2(x - 1)| = |2||x - 1| = 2|x - 1|$. So, $|(2x + 1) - 3| < \epsilon$ is
equivalent to $|x - 1| < \frac{\epsilon}{2}$. Therefore, $\delta = \frac{\epsilon}{2}$ should work.


Proof: Let $\epsilon > 0$ and let $\delta = \frac{\epsilon}{2}$. Suppose that $0 < |x - 1| < \delta$. Then we have

$$|(2x + 1) - 3| = |2x - 2| = |2(x - 1)| = |2||x - 1| = 2|x - 1| < 2\delta = 2 \cdot \tfrac{\epsilon}{2} = \epsilon.$$

Since $\epsilon > 0$ was arbitrary, we have $\forall \epsilon > 0 \; \exists \delta > 0 \; (0 < |x - 1| < \delta \to |(2x + 1) - 3| < \epsilon)$.


Therefore, $\lim_{x \to 1} (2x + 1) = 3$. □


Notes: (1) Even though we’re using the “ε − δ definition” instead of the “strip definition,” we can still visualize the situation in terms of the strip game. When we say “Let ε > 0,” we can think of this as Player 1 “attacking” with the horizontal strip H = ℝ × (3 − ε, 3 + ε). In the proof above, Player 2 is then “defending” with the vertical strip V = (1 − ε/2, 1 + ε/2) × ℝ. This defense is successful because when 1 − ε/2 < x < 1 + ε/2, we have 2 − ε < 2x < 2 + ε, and so 3 − ε < 2x + 1 < 3 + ε, or equivalently, 2x + 1 ∈ (3 − ε, 3 + ε). In other words, for x ∈ (1 − ε/2, 1 + ε/2), H ∩ V traps f.

(2) Instead of playing the strip game, we can play the ε − δ game. The idea is the same. Suppose we are trying to figure out if lim_{x→r} f(x) = L. Player 1 “attacks” by choosing a positive number ε. This is equivalent to Player 1 choosing the horizontal strip H = ℝ × (L − ε, L + ε). Player 2 then tries to “defend” by finding a positive number δ. This is equivalent to Player 2 choosing the vertical strip V = (r − δ, r + δ) × ℝ. The defense is successful if whenever x ∈ (r − δ, r + δ), x ≠ r, we have f(x) ∈ (L − ε, L + ε). This is equivalent to H ∩ V trapping f around r.


The figure to the right shows what happens during one round of the ε − δ game corresponding to checking if lim_{x→1} (2x + 1) = 3. In the figure, Player 1 chooses ε = 0.5, so that L − ε = 3 − 0.5 = 2.5 and L + ε = 3 + 0.5 = 3.5. Notice how we drew the corresponding horizontal strip H = ℝ × (2.5, 3.5). According to our proof, Player 2 chooses δ = ε/2 = 0.5/2 = 0.25. So r − δ = 1 − 0.25 = 0.75 and r + δ = 1 + 0.25 = 1.25. Notice how we drew the corresponding vertical strip V = (0.75, 1.25) × ℝ. Also notice how the rectangle H ∩ V traps f.
(3) Observe that the value for δ that Player 2 chose here is the largest value of δ that would result in a successful defense. If we widen the vertical strip at all on either side, then f would escape from the resulting rectangle. However, any smaller value of δ will still work. If we shrink the vertical strip, then f is still trapped. After all, we have less that we need to trap.

(4) In the next round, Player 1 will want to choose a smaller value for ε. If Player 1 chooses a larger value for ε, then the same δ that was already played will work to defend against that larger ε. But for this problem, it doesn’t matter how small a value for ε Player 1 chooses: Player 1 simply cannot win. All Player 2 needs to do is defend with δ = ε/2 (or any smaller positive number).

(5) Essentially the same argument can be used to show that the function f defined by f(x) = 2x + 1 is continuous at x = 1. Simply replace the expression 0 < |x − 1| < δ by the expression |x − 1| < δ everywhere it appears in the proof. The point is that f(1) = 2 · 1 + 1 = 3. Since this value is equal to lim_{x→1} (2x + 1), we don’t need to exclude x = 1 from consideration when trying to trap f.
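If you like to experiment, the ε − δ game for f(x) = 2x + 1 near x = 1 can be played on a computer. The following Python sketch is only an illustration and is not part of the proof: the function names delta_for and defense_holds are made up here, and checking finitely many sample points can suggest that a defense works but can never prove it.

```python
def delta_for(eps):
    """Player 2's reply from the proof of Example 13.8: delta = eps / 2."""
    return eps / 2


def defense_holds(eps, delta, samples=1000):
    """Test |(2x + 1) - 3| < eps at sample points x with 0 < |x - 1| < delta."""
    for k in range(1, samples + 1):
        offset = delta * k / (samples + 1)      # 0 < offset < delta
        for x in (1 - offset, 1 + offset):      # one point on each side of 1
            if not abs((2 * x + 1) - 3) < eps:
                return False                    # f escapes the rectangle
    return True


for eps in (0.5, 0.1, 0.001):
    print(eps, delta_for(eps), defense_holds(eps, delta_for(eps)))

# A vertical strip that is too wide lets f escape (compare Note 3).
print(defense_holds(0.5, 0.3))
```

With ε = 0.5, the reply δ = 0.25 passes the check, while the wider choice δ = 0.3 fails, which matches the observation that ε/2 is the largest δ that defends successfully.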


Example 13.9: Let’s use the ε − δ definition of a limit to prove that lim_{x→3} (x² − 2x + 1) = 4.

Analysis: This is quite a bit more difficult than Example 13.8.

Given ε > 0, we need to find δ > 0 so that 0 < |x − 3| < δ implies |(x² − 2x + 1) − 4| < ε. First note that |(x² − 2x + 1) − 4| = |x² − 2x − 3| = |(x − 3)(x + 1)| = |x − 3||x + 1|. Therefore, |(x² − 2x + 1) − 4| < ε is equivalent to |x − 3||x + 1| < ε.

There is a small complication here. The |x − 3| is not an issue because we’re going to be choosing δ so that this expression is small enough. But to make the argument work we also need to keep |x + 1| from getting too large. Remember from Note 3 after Example 13.8 that if we find a value for δ that works, then any smaller positive number will work too. This allows us to start by assuming that δ is smaller than any positive number we choose. So, let’s just assume that δ ≤ 1 and see what effect that has on |x + 1|.

Well, if δ ≤ 1 and 0 < |x − 3| < δ, then |x − 3| < 1. Therefore, −1 < x − 3 < 1. We now add 4 to each part of this inequality to get 3 < x + 1 < 5. Since −5 < 3, this implies that −5 < x + 1 < 5, which is equivalent to |x + 1| < 5.

So, if we assume that δ ≤ 1, then |(x² − 2x + 1) − 4| = |x − 3||x + 1| < δ · 5 = 5δ. Therefore, if we want to make sure that |(x² − 2x + 1) − 4| < ε, then it suffices to choose δ so that 5δ ≤ ε, as long as we also have δ ≤ 1. So, we will let δ = min{1, ε/5}.

Proof: Let ε > 0 and let δ = min{1, ε/5}. Suppose that 0 < |x − 3| < δ. Then since δ ≤ 1, we have |x − 3| < 1, and so, |x + 1| < 5 (see the algebra in the analysis above). Also, since δ ≤ ε/5, we have |x − 3| < ε/5. It follows that |(x² − 2x + 1) − 4| = |x² − 2x − 3| = |x − 3||x + 1| < (ε/5) · 5 = ε.

Since ε > 0 was arbitrary, we have ∀ε > 0 ∃δ > 0 (0 < |x − 3| < δ → |(x² − 2x + 1) − 4| < ε).
Therefore, lim_{x→3} (x² − 2x + 1) = 4. □
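It can be instructive to see why both parts of δ = min{1, ε/5} are needed. The Python sketch below is an illustration only (the names delta_for and defense_holds are hypothetical, and sampling finitely many points is evidence, not proof). It tests the choice from Example 13.9 and then shows what can go wrong if the cap δ ≤ 1 is dropped.

```python
def f(x):
    return x * x - 2 * x + 1


def delta_for(eps):
    """Player 2's reply from Example 13.9: delta = min{1, eps/5}."""
    return min(1.0, eps / 5)


def defense_holds(eps, delta, samples=1000):
    """Test |f(x) - 4| < eps at sample points x with 0 < |x - 3| < delta."""
    for k in range(1, samples + 1):
        offset = delta * k / (samples + 1)      # 0 < offset < delta
        for x in (3 - offset, 3 + offset):
            if not abs(f(x) - 4) < eps:
                return False
    return True


for eps in (100, 1, 0.01):
    print(eps, delta_for(eps), defense_holds(eps, delta_for(eps)))

# Without the cap, delta = eps/5 = 20 is far too generous when eps = 100,
# because |x + 1| is no longer bounded by 5 on such a wide strip.
print(defense_holds(100, 100 / 5))
```

The last line reports a failed defense: for x near 23 we have |x − 3||x + 1| close to 20 · 24 = 480, which is much larger than 100.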
Example 13.10: Let m, b ∈ ℝ with m ≠ 0. Let’s use the ε − δ definition of continuity to prove that the function f: ℝ → ℝ defined by f(x) = mx + b is continuous everywhere.

A function of the form f(x) = mx + b, where m, b ∈ ℝ and m ≠ 0, is called a linear function. So, we will now show that every linear function is continuous everywhere.

Analysis: Given a ∈ ℝ and ε > 0, we will find δ > 0 so that |x − a| < δ implies |f(x) − f(a)| < ε. First note that |f(x) − f(a)| = |(mx + b) − (ma + b)| = |mx − ma| = |m||x − a|. Therefore, |f(x) − f(a)| < ε is equivalent to |x − a| < ε/|m|. So, δ = ε/|m| should work.

Proof: Let a ∈ ℝ, let ε > 0, and let δ = ε/|m|. Suppose that |x − a| < δ. Then we have

|f(x) − f(a)| = |(mx + b) − (ma + b)| = |mx − ma| = |m||x − a| < |m|δ = |m| · (ε/|m|) = ε.

Since ε > 0 was arbitrary, we have ∀ε > 0 ∃δ > 0 (|x − a| < δ → |f(x) − f(a)| < ε). Therefore, f is continuous at x = a. Since a ∈ ℝ was arbitrary, f is continuous everywhere. □


Notes: (1) We proved ∀a ∈ ℝ ∀ε > 0 ∃δ > 0 ∀x ∈ ℝ (|x − a| < δ → |f(x) − f(a)| < ε). In words, we proved that for every real number a, given a positive real number ε, we can find a positive real number δ such that whenever the distance between x and a is less than δ, the distance between f(x) and f(a) is less than ε. And of course, a simpler way to say this is “for every real number a, f is continuous at a,” or “∀a ∈ ℝ (f is continuous at a).”

(2) If we move the expression ∀a ∈ ℝ next to ∀x ∈ ℝ, we get a concept that is stronger than continuity. We say that a function f: A → ℝ is uniformly continuous on A if

∀ε > 0 ∃δ > 0 ∀a, x ∈ A (|x − a| < δ → |f(x) − f(a)| < ε).

(3) As a quick example of uniform continuity, every linear function is uniformly continuous on ℝ. We can see this by modifying the proof above just slightly:

New proof: Let ε > 0 and let δ = ε/|m|. Let a, x ∈ ℝ and suppose that |x − a| < δ. Then we have

|f(x) − f(a)| = |(mx + b) − (ma + b)| = |mx − ma| = |m||x − a| < |m|δ = |m| · (ε/|m|) = ε.

Since ε > 0 was arbitrary, we have ∀ε > 0 ∃δ > 0 ∀a, x ∈ ℝ (|x − a| < δ → |f(x) − f(a)| < ε).
Therefore, f is uniformly continuous on ℝ.
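The key feature of the new proof is that δ = ε/|m| does not depend on the point a. As an informal check, here is a short Python sketch; the function names uniform_delta and works_at, the slope m = −3, and the sampled points a are all choices made just for this illustration, and a finite sample is of course not a proof.

```python
def uniform_delta(m, eps):
    """The single delta from the proof: it depends on eps and m, but not on a."""
    return eps / abs(m)


def works_at(m, b, a, eps, delta, samples=200):
    """Test |f(x) - f(a)| < eps at sample points x with |x - a| < delta."""
    def f(x):
        return m * x + b

    for k in range(1, samples + 1):
        offset = delta * k / (samples + 1)      # 0 < offset < delta
        for x in (a - offset, a + offset):
            if not abs(f(x) - f(a)) < eps:
                return False
    return True


m, b, eps = -3, 7, 0.01
delta = uniform_delta(m, eps)                   # chosen once, before any a
for a in (-1000, 0, 2.5, 1000):
    print(a, works_at(m, b, a, eps, delta))
```

The same δ is reused at every point a in the loop, which is exactly what the quantifier order ∃δ > 0 ∀a, x requires.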


(4) The difference between continuity and uniform continuity on a set A can be described as follows: In both cases, an ε is given and then a δ is chosen. For continuity, for each value of x, we can choose a different δ. For uniform continuity, once we choose a δ for some value of x, we need to be able to use the same δ for every other value of x in A.

In terms of strips, once a horizontal strip is given, we need to be more careful how we choose a vertical strip. As we check different x-values, we can move the vertical strip left and right. However, we are not allowed to decrease the width of the vertical strip.

Try to come up with a function that is continuous on a set A, but not uniformly continuous on A. This will be explored a little more in the problem set below.


Limit and Continuity Theorems 
Theorem 13.3: Let A, B ⊆ ℝ, let f: A → ℝ, g: B → ℝ, let r ∈ ℝ, and suppose that lim_{x→r}[f(x)] and lim_{x→r}[g(x)] are both finite real numbers. Then lim_{x→r}[f(x) + g(x)] = lim_{x→r}[f(x)] + lim_{x→r}[g(x)].


Analysis: If lim_{x→r}[f(x)] = L, then given ε > 0, there is δ > 0 such that 0 < |x − r| < δ implies |f(x) − L| < ε. If lim_{x→r}[g(x)] = K, then given ε > 0, there is δ > 0 such that 0 < |x − r| < δ implies |g(x) − K| < ε. We should acknowledge something here. If we are given a single positive real value for ε, there is no reason that we would necessarily choose the same δ for both f and g. However, using the fact that once we find a δ that works, any smaller δ will also work, it is easy to see that we could choose a single value for δ that would work for both f and g. This should be acknowledged in some way in the proof. There are several ways to work this into the argument. The way we will handle this is to use δ₁ for f and δ₂ for g, and then let δ be the smaller of δ₁ and δ₂.

Next, recall from Theorem 7.3 from Lesson 7 that the Triangle Inequality says that for all x, y ∈ ℝ, |x + y| ≤ |x| + |y|. (The theorem is stated to be true for all complex numbers, but since ℝ ⊆ ℂ, it is equally true for all real numbers.) After assuming 0 < |x − r| < δ, we will use the Triangle Inequality to write

|f(x) + g(x) − (L + K)| = |(f(x) − L) + (g(x) − K)| ≤ |f(x) − L| + |g(x) − K| < ε + ε = 2ε.

It seems that we wound up with 2ε on the right-hand side instead of ε. Now, if ε is an arbitrarily small positive real number, then so is 2ε, and vice versa. So, getting 2ε on the right-hand side instead of ε really isn’t too big of a deal. However, to be rigorous, we should prove that it is okay. There are at least two ways we can handle this. One possibility is to prove a theorem that says 2ε works just as well as ε. A second possibility (and the way I usually teach it in basic analysis courses) is to edit the original ε’s, so it all works out to ε in the end. The idea is simple. If ε is a positive real number, then so is ε/2. So, after we are given ε, we can pretend that Player 1 (in the ε − δ game) is “attacking” with ε/2 instead. Let’s see how this all plays out in the proof.

Proof: Suppose that lim_{x→r}[f(x)] = L and lim_{x→r}[g(x)] = K, and let ε > 0. Since lim_{x→r}[f(x)] = L, there is δ₁ > 0 such that 0 < |x − r| < δ₁ implies |f(x) − L| < ε/2. Since lim_{x→r}[g(x)] = K, there is δ₂ > 0 such that 0 < |x − r| < δ₂ implies |g(x) − K| < ε/2. Let δ = min{δ₁, δ₂} and suppose that 0 < |x − r| < δ. Then since δ ≤ δ₁, |f(x) − L| < ε/2. Since δ ≤ δ₂, |g(x) − K| < ε/2. By the Triangle Inequality, we have

|f(x) + g(x) − (L + K)| = |(f(x) − L) + (g(x) − K)| ≤ |f(x) − L| + |g(x) − K| < ε/2 + ε/2 = ε.

So, lim_{x→r}[f(x) + g(x)] = L + K = lim_{x→r}[f(x)] + lim_{x→r}[g(x)]. □
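The proof above is really a recipe: given a way to answer ε-attacks for f and a way to answer them for g, it builds a way to answer them for f + g. The Python sketch below writes that recipe as a function. Everything named here (delta_sum, defended, and the sample functions f(x) = 2x + 1 and g(x) = 3x near r = 1) is an assumption made for illustration, and the numerical check is evidence only.

```python
def delta_sum(delta_f, delta_g):
    """Build Player 2's strategy for f + g from strategies for f and g.

    delta_f(eps) and delta_g(eps) are assumed to return a delta that works
    for f and g, respectively. As in the proof, each is attacked with eps/2
    and the smaller of the two replies is used.
    """
    def delta(eps):
        return min(delta_f(eps / 2), delta_g(eps / 2))
    return delta


# Sample data (not from the text): near r = 1, f(x) = 2x + 1 has limit L = 3
# with delta = eps/2, and g(x) = 3x has limit K = 3 with delta = eps/3.
def f(x):
    return 2 * x + 1


def g(x):
    return 3 * x


delta = delta_sum(lambda eps: eps / 2, lambda eps: eps / 3)


def defended(eps, samples=1000):
    """Test |f(x) + g(x) - (L + K)| < eps for sample x with 0 < |x - 1| < delta."""
    d = delta(eps)
    for k in range(1, samples + 1):
        offset = d * k / (samples + 1)
        for x in (1 - offset, 1 + offset):
            if not abs(f(x) + g(x) - 6) < eps:
                return False
    return True


for eps in (0.6, 0.01):
    print(eps, delta(eps), defended(eps))
```

For ε = 0.6 the combined reply is min{0.15, 0.1} = 0.1, the smaller of the two individual replies to ε/2 = 0.3.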


Theorem 13.4: Let A, B ⊆ ℝ, let f: A → ℝ, g: B → ℝ, let r ∈ ℝ, and suppose that lim_{x→r}[f(x)] and lim_{x→r}[g(x)] are both finite real numbers. Then lim_{x→r}[f(x)g(x)] = lim_{x→r}[f(x)] · lim_{x→r}[g(x)].




Analysis: As in Theorem 13.3, we let lim_{x→r}[f(x)] = L and lim_{x→r}[g(x)] = K. If ε > 0 is given, we will find a single δ > 0 such that 0 < |x − r| < δ implies |f(x) − L| < ε and |g(x) − K| < ε (like we did for Theorem 13.3). Now, we want to show that whenever 0 < |x − r| < δ, |f(x)g(x) − LK| < ε. This is quite a bit more challenging than anything we had to do in Theorem 13.3.

To show that |f(x)g(x) − LK| < ε we will apply the Standard Advanced Calculus Trick (SACT; see Note 7 following Example 4.5 from Lesson 4). We would like for |f(x) − L| and |g(x) − K| to appear as factors in our expression. To make this happen, we subtract Lg(x) from f(x)g(x) to get f(x)g(x) − Lg(x) = (f(x) − L)g(x). To “undo the damage,” we then add back Lg(x). The application of SACT together with the Triangle Inequality looks like this:

|f(x)g(x) − LK| = |(f(x)g(x) − Lg(x)) + (Lg(x) − LK)|
≤ |f(x)g(x) − Lg(x)| + |Lg(x) − LK| = |f(x) − L||g(x)| + |L||g(x) − K|
≤ ε|g(x)| + |L|ε = ε(|g(x)| + |L|).

Uh oh! How can we possibly get rid of |g(x)| + |L|? We have seen how to handle a constant multiple of ε in the proof of Theorem 13.3. But this time we are multiplying ε by a function of x. We will resolve this issue by making sure we choose δ small enough so that g(x) is sufficiently bounded.

We do this by taking a specific value for ε, and then using the fact that lim_{x→r}[g(x)] = K to come up with a δ > 0 and a bound M for g on the deleted δ-neighborhood of r. For simplicity, let’s choose ε = 1. Then since lim_{x→r}[g(x)] = K, we can find δ > 0 such that 0 < |x − r| < δ implies |g(x) − K| < 1. Now, |g(x) − K| < 1 ⇔ −1 < g(x) − K < 1 ⇔ K − 1 < g(x) < K + 1. For example, if K = 5, we would have 4 < g(x) < 6. Since this implies −6 < g(x) < 6, or equivalently, |g(x)| < 6, we could choose M = 6. If, on the other hand, K = −3, we would have −4 < g(x) < −2. Since this implies −4 < g(x) < 4, or equivalently, |g(x)| < 4, we could choose M = 4. In general, we will let M = max{|K − 1|, |K + 1|}.

We will now be able to get |f(x)g(x) − LK| ≤ ε(|g(x)| + |L|) < ε(M + |L|). Great! Now it looks just like the situation we had in Theorem 13.3. The number M + |L| looks messier, but it is just a number, and so we can finish cleaning up the argument by replacing Player 1’s ε-attacks by ε/(M + |L|).


Proof: Suppose that lim_{x→r}[f(x)] = L and lim_{x→r}[g(x)] = K, and let ε > 0. Since lim_{x→r}[g(x)] = K, there is δ₁ > 0 such that 0 < |x − r| < δ₁ implies |g(x) − K| < 1. Now, |g(x) − K| < 1 is equivalent to −1 < g(x) − K < 1, or by adding K, K − 1 < g(x) < K + 1. Let M = max{|K − 1|, |K + 1|}. Then, 0 < |x − r| < δ₁ implies −M < g(x) < M, or equivalently, |g(x)| < M. Note also that M > 0. Therefore, M + |L| > 0.

Now, since lim_{x→r}[f(x)] = L, there is δ₂ > 0 such that 0 < |x − r| < δ₂ implies |f(x) − L| < ε/(M + |L|). Since lim_{x→r}[g(x)] = K, there is δ₃ > 0 such that 0 < |x − r| < δ₃ implies |g(x) − K| < ε/(M + |L|). Let δ = min{δ₁, δ₂, δ₃} and suppose that 0 < |x − r| < δ. Then since δ ≤ δ₁, |g(x)| < M. Since δ ≤ δ₂, |f(x) − L| < ε/(M + |L|). Since δ ≤ δ₃, |g(x) − K| < ε/(M + |L|). By the Triangle Inequality (and SACT), we have

|f(x)g(x) − LK| = |(f(x)g(x) − Lg(x)) + (Lg(x) − LK)|
≤ |f(x)g(x) − Lg(x)| + |Lg(x) − LK| = |f(x) − L||g(x)| + |L||g(x) − K|
< (ε/(M + |L|)) · M + |L| · (ε/(M + |L|)) = (ε/(M + |L|)) · (M + |L|) = ε.

So, lim_{x→r}[f(x)g(x)] = LK = lim_{x→r}[f(x)] · lim_{x→r}[g(x)]. □
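As with the sum, the product proof can be read as a recipe for building Player 2’s reply. The Python sketch below follows the three steps of the proof (bound g, compute M, then attack f and g with ε/(M + |L|)). The names product_delta and defended, and the sample functions f(x) = 2x + 1 and g(x) = 3x near r = 1, are assumptions made for this illustration; the sampling check is not a proof.

```python
def product_delta(delta_f, delta_g, L, K):
    """Build Player 2's strategy for the product fg, following Theorem 13.4.

    delta_f(eps) and delta_g(eps) are assumed to return a delta that works
    for f (with limit L) and for g (with limit K), respectively.
    """
    def delta(eps):
        d1 = delta_g(1.0)                       # ensures |g(x) - K| < 1
        M = max(abs(K - 1), abs(K + 1))         # so that |g(x)| < M
        target = eps / (M + abs(L))             # the edited attack
        d2 = delta_f(target)                    # ensures |f(x) - L| < target
        d3 = delta_g(target)                    # ensures |g(x) - K| < target
        return min(d1, d2, d3)
    return delta


# Sample data (not from the text): near r = 1, f(x) = 2x + 1 has limit L = 3
# with delta = eps/2, and g(x) = 3x has limit K = 3 with delta = eps/3.
def f(x):
    return 2 * x + 1


def g(x):
    return 3 * x


delta = product_delta(lambda eps: eps / 2, lambda eps: eps / 3, L=3, K=3)


def defended(eps, samples=1000):
    """Test |f(x)g(x) - LK| < eps for sample x with 0 < |x - 1| < delta."""
    d = delta(eps)
    for k in range(1, samples + 1):
        offset = d * k / (samples + 1)
        for x in (1 - offset, 1 + offset):
            if not abs(f(x) * g(x) - 9) < eps:
                return False
    return True


for eps in (100, 0.5, 0.001):
    print(eps, delta(eps), defended(eps))
```

Here M = max{|3 − 1|, |3 + 1|} = 4, so the edited attack is ε/7. For a large attack such as ε = 100, the reply is δ₁ = 1/3, which shows why the bounding step cannot be skipped.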


Limits Involving Infinity 


Recall that a horizontal strip in ℝ × ℝ is a set of the form ℝ × (c, d) = {(x, y) | c < y < d} and a vertical strip is a set of the form (a, b) × ℝ = {(x, y) | a < x < b}. If we allow a and/or c to take on the value −∞ (in which case we say that the strip contains −∞) and we allow b and/or d to take on the value +∞ (in which case we say that the strip contains +∞), we can extend our definition of limit to handle various situations involving infinity.


Example 13.11: Let’s take a look at the horizontal strip ℝ × (1, +∞) and the vertical strip (−∞, −2) × ℝ in the xy-plane. These can be visualized as follows:

[Figure: two coordinate planes, the left showing the horizontal strip ℝ × (1, +∞) and the right showing the vertical strip (−∞, −2) × ℝ.]

The horizontal strip ℝ × (1, +∞) contains +∞ and the vertical strip (−∞, −2) × ℝ contains −∞.

Note: Strips that contain +∞ or −∞ are usually called half planes. Here, we will continue to use the expression “strip” because it allows us to handle all types of limits (finite and infinite) without having to discuss every case individually.

By allowing strips to contain +∞ or −∞, intersections of horizontal and vertical strips can now be unbounded. The resulting open rectangles (a, b) × (c, d) can have a and/or c taking on the value −∞ and b and/or d taking on the value +∞.
Example 13.12: Consider the horizontal strip H = ℝ × (1, +∞) and the vertical strip V = (−2, −1) × ℝ. The intersection of these strips is the open rectangle R = (−2, −1) × (1, +∞). The rectangle R traps (−1.5, 3), whereas (−1.5, 0) escapes from R. This can be seen in the figure below on the left. Also, consider the horizontal strip H = ℝ × (1, +∞) and the vertical strip V = (−∞, −2) × ℝ. The intersection of these strips is the open rectangle S = (−∞, −2) × (1, +∞). The rectangle S traps (−3, 2), whereas (−3, −1) escapes from S. This can be seen in the figure below on the right.

[Figure: on the left, the rectangle R = (−2, −1) × (1, +∞) with the trapped point (−1.5, 3) and the escaping point (−1.5, 0); on the right, the rectangle S = (−∞, −2) × (1, +∞) with the trapped point (−3, 2) and the escaping point (−3, −1).]

When we allow +∞ and −∞, the definitions of “trap” and “escape” are just about the same. We just need to make the following minor adjustment.

Small technicality: If r = +∞ or r = −∞, then we define R traps f around r to simply mean that R traps f. In other words, when checking a limit that is approaching +∞ or −∞, we do not exclude any point from consideration as we would do if r were a finite real number.


Example 13.13: Let f: ℝ \ {0} → ℝ be defined by f(x) = 1/x², let r = 0, and let L = +∞. Let’s show that lim_{x→0} f(x) = +∞. Let H = ℝ × (c, +∞) be a horizontal strip that contains +∞. We may assume that c > 0 (if c ≤ 0, then f(x) > 0 ≥ c for every x ≠ 0, and so any vertical strip containing 0 works). Next, let V = (−1/√c, 1/√c) × ℝ. We will show that H ∩ V = (−1/√c, 1/√c) × (c, +∞) traps f around 0. Let x ∈ (−1/√c, 1/√c), so that −1/√c < x < 1/√c, and x ≠ 0. Then −1/√c < x < 0 or 0 < x < 1/√c. In either case, x² < 1/c and therefore, c < 1/x² = f(x). Since x ∈ (−1/√c, 1/√c) \ {0} and f(x) ∈ (c, +∞), it follows that (x, f(x)) ∈ (−1/√c, 1/√c) × (c, +∞) = H ∩ V. Therefore, H ∩ V traps f around 0. So, lim_{x→0} f(x) = +∞.
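The strip game with an unbounded horizontal strip can also be tried numerically. In the Python sketch below (an illustration only; the names vertical_strip and traps_around_zero are made up here, and c is assumed to be positive), Player 1 attacks with ℝ × (c, +∞) and Player 2 replies with the strip (−1/√c, 1/√c) × ℝ.

```python
import math


def vertical_strip(c):
    """Player 2's reply (-1/sqrt(c), 1/sqrt(c)) to the strip R x (c, +inf), c > 0."""
    half_width = 1 / math.sqrt(c)
    return (-half_width, half_width)


def traps_around_zero(c, samples=1000):
    """Test 1/x^2 > c at sample points x in the strip, excluding x = 0."""
    a, b = vertical_strip(c)
    for k in range(1, samples + 1):
        for x in (a * k / (samples + 1), b * k / (samples + 1)):
            if not 1 / x**2 > c:
                return False
    return True


for c in (1, 100, 10**6):
    print(c, vertical_strip(c), traps_around_zero(c))
```

Notice that the larger Player 1 makes c, the narrower Player 2’s strip becomes, just as a smaller ε forces a smaller δ for finite limits.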


Example 13.14: Let f: ℝ → ℝ be defined by f(x) = −x + 3, let r = +∞, and let L = −∞. Let’s show that lim_{x→+∞} f(x) = −∞. Let H = ℝ × (−∞, d) be a horizontal strip that contains −∞. Next, let V = (3 − d, +∞) × ℝ. We will show that H ∩ V = (3 − d, +∞) × (−∞, d) traps f around +∞ (or more simply, that (3 − d, +∞) × (−∞, d) traps f). Let x ∈ (3 − d, +∞), so that x > 3 − d. Then we have −x < d − 3, and so, f(x) = −x + 3 < d. Since x ∈ (3 − d, +∞) and f(x) ∈ (−∞, d), it follows that (x, f(x)) ∈ (3 − d, +∞) × (−∞, d) = H ∩ V. Therefore, H ∩ V traps f. So, lim_{x→+∞} f(x) = −∞.

We can find equivalent definitions for limits involving infinity on a case-by-case basis. We will do one example here and you will look at others in Problem 15 in the problem set below.


Theorem 13.5: lim_{x→r} f(x) = +∞ if and only if ∀M > 0 ∃δ > 0 (0 < |x − r| < δ → f(x) > M).

Proof: Suppose that lim_{x→r} f(x) = +∞ and let M > 0. Let H = ℝ × (M, +∞). Since lim_{x→r} f(x) = +∞, there is a vertical strip V = (a, b) × ℝ that contains r such that the rectangle (a, b) × (M, +∞) traps f around r. Let δ = min{r − a, b − r}, and let 0 < |x − r| < δ. Then x ≠ r and −δ < x − r < δ. So, r − δ < x < r + δ. Since δ ≤ r − a, we have a ≤ r − δ. Since δ ≤ b − r, we have b ≥ r + δ. Therefore, a < x < b, and so, x ∈ (a, b). Since x ≠ r, x ∈ (a, b), and (a, b) × (M, +∞) traps f around r, we have f(x) ∈ (M, +∞). Thus, f(x) > M.

Conversely, suppose that ∀M > 0 ∃δ > 0 (0 < |x − r| < δ → f(x) > M). Let H = ℝ × (c, +∞) be a horizontal strip containing +∞ and let M = max{c, 1}. Then there is δ > 0 such that 0 < |x − r| < δ implies f(x) > M. Let V = (r − δ, r + δ) × ℝ and let R = H ∩ V = (r − δ, r + δ) × (c, +∞). We show that R traps f around r. Indeed, if x ∈ (r − δ, r + δ) and x ≠ r, then 0 < |x − r| < δ and so, f(x) > M. So, (x, f(x)) ∈ (r − δ, r + δ) × (M, +∞) ⊆ (r − δ, r + δ) × (c, +∞) (because c ≤ M). So, R traps f around r. □


One-sided Limits 
Let A ⊆ ℝ, let f: A → ℝ, and let r ∈ ℝ and L ∈ ℝ ∪ {−∞, +∞}. We say that the limit of f as x approaches r from the right is L, written lim_{x→r⁺} f(x) = L, if for every horizontal strip H that contains L there is a vertical strip V of the form (r, b) × ℝ such that the rectangle H ∩ V traps f.


Example 13.15: Let f: ℝ \ {1} → ℝ be defined by f(x) = 1/(x − 1), let r = 1, and let L = +∞. Let’s show that lim_{x→1⁺} f(x) = +∞. Let H = ℝ × (c, +∞) be a horizontal strip that contains +∞ and let M = max{1, c}. Let V = (1, 1/M + 1) × ℝ. We will show that H ∩ V = (1, 1/M + 1) × (c, +∞) traps f. Let x ∈ (1, 1/M + 1), so that 1 < x < 1/M + 1. Then we have 0 < x − 1 < 1/M and so 1/(x − 1) > M ≥ c. So, f(x) > c. Since x ∈ (1, 1/M + 1) and f(x) ∈ (c, +∞), (x, f(x)) ∈ (1, 1/M + 1) × (c, +∞) = H ∩ V. Therefore, H ∩ V traps f. So, lim_{x→1⁺} f(x) = +∞.
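The role of M = max{1, c} in Example 13.15 is to keep the right endpoint 1/M + 1 meaningful even when Player 1 attacks with c ≤ 0. The Python sketch below is an informal check only; the names right_strip and traps are hypothetical, and testing sample points does not replace the proof.

```python
def right_strip(c):
    """Player 2's one-sided reply (1, 1/M + 1), where M = max{1, c}."""
    M = max(1.0, c)
    return (1.0, 1.0 / M + 1.0)


def traps(c, samples=1000):
    """Test 1/(x - 1) > c at sample points x strictly inside the strip."""
    a, b = right_strip(c)
    for k in range(1, samples + 1):
        x = a + (b - a) * k / (samples + 1)     # a < x < b, so x > 1
        if not 1 / (x - 1) > c:
            return False
    return True


for c in (-5, 0, 1, 1000):
    print(c, right_strip(c), traps(c))
```

For c = −5 and c = 0 the reply is the strip (1, 2), since M = 1 in both cases. Only points to the right of 1 are sampled, which reflects the fact that a right-hand limit ignores what f does to the left of r.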


Theorem 13.6: lim_{x→r⁺} f(x) = L (L real) if and only if ∀ε > 0 ∃δ > 0 (0 < x − r < δ → |f(x) − L| < ε).

Proof: Suppose that lim_{x→r⁺} f(x) = L and let ε > 0. Let H = ℝ × (L − ε, L + ε). Since lim_{x→r⁺} f(x) = L, there is a vertical strip V = (r, b) × ℝ such that the rectangle H ∩ V = (r, b) × (L − ε, L + ε) traps f.

Let δ = b − r, and let 0 < x − r < δ. Then r < x < b and so, x ∈ (r, b). Since (r, b) × (L − ε, L + ε) traps f, we have f(x) ∈ (L − ε, L + ε). Thus, L − ε < f(x) < L + ε, or equivalently, |f(x) − L| < ε.

Conversely, suppose that ∀ε > 0 ∃δ > 0 (0 < x − r < δ → |f(x) − L| < ε). Let H = ℝ × (c, d) be a horizontal strip containing L and let ε = min{L − c, d − L}. Then there is δ > 0 such that 0 < x − r < δ implies |f(x) − L| < ε. Let V = (r, r + δ) × ℝ and R = H ∩ V = (r, r + δ) × (c, d). We show that R traps f. If x ∈ (r, r + δ), then r < x < r + δ, or equivalently, 0 < x − r < δ. So, |f(x) − L| < ε. Therefore, −ε < f(x) − L < ε, or equivalently, L − ε < f(x) < L + ε. Thus, (x, f(x)) ∈ (r, r + δ) × (L − ε, L + ε) ⊆ (r, r + δ) × (c, d) (Check this!). So, R traps f. □
Problem Set 13


Full solutions to these problems are available for free download here: 


www.SATPrepGet800.com/PMFBXSG 


LEVEL 1 


1. Let f: ℝ → ℝ be defined by f(x) = 5x − 1.
(i) Prove that lim_{x→3} f(x) = 14.

(ii) Prove that f is continuous on ℝ.

2. Let r, c ∈ ℝ and let f: ℝ → ℝ be defined by f(x) = c. Prove that lim_{x→r}[f(x)] = c.

3. Let A ⊆ ℝ, let f: A → ℝ, let r, k ∈ ℝ, and suppose that lim_{x→r}[f(x)] is a finite real number. Prove that lim_{x→r}[kf(x)] = k lim_{x→r}[f(x)].
x-r x-r 


LEVEL 2 


4. Let A ⊆ ℝ, let f: A → ℝ, and let r ∈ ℝ. Prove that f is continuous at r if and only if lim_{x→r}[f(x)] = f(r).

5. Prove that every polynomial function p: ℝ → ℝ is continuous on ℝ.


LEVEL 3 
6. Let g: ℝ → ℝ be defined by g(x) = 2x² − 3x + 7.
(i) Prove that lim g(x) = 6.
(ii) Prove that g is continuous on ℝ.

7. Suppose that f, g: ℝ → ℝ, a ∈ ℝ, f is continuous at a, and g is continuous at f(a). Prove that g ∘ f is continuous at a.


LEVEL 4 


8. Let h: ℝ → ℝ be defined by h(x) = (x³ − 4)/(x² + 1). Prove that lim_{x→2} h(x) = 4/5.

9. Let k: (0, ∞) → ℝ be defined by k(x) = √x.
(i) Prove that lim_{x→25} k(x) = 5.

(ii) Prove that k is continuous on (0, ∞).

(iii) Is k uniformly continuous on (0, ∞)?
10. Let f: ℝ → ℝ be defined by f(x) = x². Prove that f is continuous on ℝ, but not uniformly continuous on ℝ.

11. Prove that if lim_{x→r}[f(x)] > 0, then there is a deleted neighborhood N of r such that f(x) > 0 for all x ∈ N.

12. Let A ⊆ ℝ, let f: A → ℝ, let r ∈ ℝ, and suppose that lim_{x→r}[f(x)] is a finite real number. Prove that there is M ∈ ℝ and an open interval (a, b) containing r such that |f(x)| ≤ M for all x ∈ (a, b) \ {r}.

13. Let A ⊆ ℝ, let f, g, h: A → ℝ, let r ∈ ℝ, let f(x) ≤ g(x) ≤ h(x) for all x ∈ A \ {r}, and suppose that lim_{x→r}[f(x)] = lim_{x→r}[h(x)] = L. Prove that lim_{x→r}[g(x)] = L.


LEVEL 5 


14. Let A ⊆ ℝ, let f, g: A → ℝ such that g(x) ≠ 0 for all x ∈ A, let r ∈ ℝ, and suppose that lim_{x→r}[f(x)] and lim_{x→r}[g(x)] are both finite real numbers such that lim_{x→r}[g(x)] ≠ 0. Prove that lim_{x→r}[f(x)/g(x)] = (lim_{x→r} f(x)) / (lim_{x→r} g(x)).


15. Give a reasonable equivalent definition for each of the following limits (like what was done in Theorem 13.5). r and L are finite real numbers.

(i) lim_{x→r} f(x) = −∞
(ii) lim_{x→+∞} f(x) = L
(iii) lim_{x→−∞} f(x) = L
(iv) lim_{x→+∞} f(x) = +∞
(v) lim_{x→+∞} f(x) = −∞
(vi) lim_{x→−∞} f(x) = +∞
(vii) lim_{x→−∞} f(x) = −∞


16. Let f(x) = −x² + x + 1. Use the M − K definition of an infinite limit (that you came up with in Problem 15) to prove lim_{x→+∞} f(x) = −∞.
17. Give a reasonable definition for each of the following limits (like what was done in Theorem 13.6). r and L are finite real numbers.


(i) lim_{x→r⁻} f(x) = L
(ii) lim_{x→r⁺} f(x) = +∞
(iii) lim_{x→r⁺} f(x) = −∞
(iv) lim_{x→r⁻} f(x) = +∞
(v) lim_{x→r⁻} f(x) = −∞


18. Use the M − δ definition of a one-sided limit (that you came up with in Problem 17) to prove that lim_{x→3⁻} 1/(x − 3) = −∞.
x+1 
19. Let f (x) = CENA Prove that 


(i) lim f(x) = 0.
(ii) lim f(x) = +∞.


if x is rational. 


0 
20. Let f: R > R be defined by f (x) = { pias) meaner 


does not exist. 


Prove that for all r € R, lim[f(x)] 
x-r 
LESSON 14 - TOPOLOGY
SPACES AND HOMEOMORPHISMS


Topological Spaces 


A topological space consists of a set S together with a collection of “open” subsets of S. Before we give 
the formal definition of “open,” let’s quickly review a standard example that most of us are somewhat 
familiar with. 


Consider the set ℝ of real numbers and call a subset X of ℝ open if for every real number x ∈ X, there is an open interval (a, b) with x ∈ (a, b) and (a, b) ⊆ X. We were first introduced to this definition of an open set in Lesson 6. In that same lesson, we showed that ∅ and ℝ are both open sets (Theorem 6.4), we proved that an arbitrary union of open sets is open (Theorem 6.7), and we proved that a finite intersection of open sets is open (Theorem 6.9 and part (iii) of Problem 6 from that lesson). As it turns out, with this definition of open, every open set can be expressed as a union of open intervals (Theorem 6.8).


In this lesson we will move to a more general setting and explore arbitrary sets together with various 
collections of “open” subsets of these sets. Let’s begin by giving the formal definition of a topological 
space. 


Let S be a set and let 𝒯 be a collection of subsets of S. 𝒯 is said to be a topology on S if the following three properties are satisfied:

1. ∅ ∈ 𝒯 and S ∈ 𝒯.

2. If 𝒳 ⊆ 𝒯, then ⋃𝒳 ∈ 𝒯 (𝒯 is closed under taking arbitrary unions).

3. If 𝒴 ⊆ 𝒯 and 𝒴 is finite, then ⋂𝒴 ∈ 𝒯 (𝒯 is closed under taking finite intersections).

A topological space is a pair (S, 𝒯), where S is a set and 𝒯 is a topology on S. We will call the elements of 𝒯 open sets. Complements of elements of 𝒯 will be called closed sets (A is closed if and only if S \ A is open).

We may sometimes refer to the topological space S. When we do, there is a topology 𝒯 on S that we are simply not mentioning explicitly.
Example 14.1:

1. Let S = {a} be a set consisting of just the one element a. There is just one topology on S. It is the topology 𝒯 = {∅, {a}}.

Note that the power set of S is 𝒫(S) = {∅, {a}} and 𝒫(𝒫(S)) = {∅, {∅}, {{a}}, {∅, {a}}}. Notice that the topology 𝒯 = {∅, {a}} is an element of 𝒫(𝒫(S)). However, the other three elements of 𝒫(𝒫(S)) are not topologies on S = {a}.

In general, for any set S, a topology on S is a subset of 𝒫(S), or equivalently, an element of 𝒫(𝒫(S)). If S ≠ ∅, then not every element of 𝒫(𝒫(S)) will be a topology on S.

For example, if S = {a}, then ∅, {∅}, and {{a}} are all elements of 𝒫(𝒫(S)) that are not topologies on S. ∅ and {∅} fail to be topologies on {a} because they do not contain {a}, while {{a}} fails to be a topology on {a} because it does not contain ∅.


2. Let S = {a, b} be a set consisting of the two distinct elements a and b. There are four topologies on S: 𝒯₁ = {∅, {a, b}}, 𝒯₂ = {∅, {a}, {a, b}}, 𝒯₃ = {∅, {b}, {a, b}}, and 𝒯₄ = {∅, {a}, {b}, {a, b}}.

We can visualize these topologies as follows.

[Figure: the four topologies 𝒯₁, 𝒯₂, 𝒯₃, and 𝒯₄ on {a, b}.]

Notice that all four topologies in the figure have the elements a and b inside a large circle because S = {a, b} is in all four topologies. Also, it is understood that ∅ is in all the topologies.

𝒯₁ is called the trivial topology (or indiscrete topology) on S because it contains only ∅ and S. 𝒯₄ is called the discrete topology on S, as it contains every subset of S. The discrete topology is just 𝒫(S) (the power set of S).

The topologies 𝒯₂, 𝒯₃, and 𝒯₄ are finer than the topology 𝒯₁ because 𝒯₁ ⊆ 𝒯₂, 𝒯₁ ⊆ 𝒯₃, and 𝒯₁ ⊆ 𝒯₄. We can also say that 𝒯₁ is coarser than 𝒯₂, 𝒯₃, and 𝒯₄. Similarly, 𝒯₄ is finer than 𝒯₂ and 𝒯₃, or equivalently, 𝒯₂ and 𝒯₃ are coarser than 𝒯₄. The topologies 𝒯₂ and 𝒯₃ are incomparable. Neither one is finer than the other. To help understand the terminology “finer” and “coarser,” we can picture the open sets as a pile of rocks. If we were to smash that pile of rocks (the open sets) with a hammer, the rocks will break into smaller pieces (creating more open sets), and the pile of rocks (the topology) will have been made “finer.”

Note that for any set S, the discrete topology is always the finest topology and the trivial topology is always the coarsest.


3. Let S = {a, b, c} be a set consisting of the three distinct elements a, b, and c. There are 29 topologies on S. Let’s look at a few of them.

We have the trivial topology 𝒯₁ = {∅, {a, b, c}}.

If we throw in just a singleton set (a set consisting of just one element), we get the three topologies 𝒯₂ = {∅, {a}, {a, b, c}}, 𝒯₃ = {∅, {b}, {a, b, c}}, 𝒯₄ = {∅, {c}, {a, b, c}}.

Note that we can’t throw in just two singleton sets. For example, {∅, {a}, {b}, {a, b, c}} is not a topology on S. Do you see the problem? It’s not closed under taking unions: {a} and {b} are there, but {a, b} = {a} ∪ {b} is not! However, 𝒯₅ = {∅, {a}, {b}, {a, b}, {a, b, c}} is a topology on S.

Here are a few chains of topologies on S written in order from the coarsest to the finest topology (chains are linearly ordered subsets of {𝒯 | 𝒯 is a topology on S}).

{∅, {a, b, c}} ⊆ {∅, {a}, {a, b, c}} ⊆ {∅, {a}, {a, b}, {a, b, c}} ⊆ {∅, {a}, {b}, {a, b}, {a, b, c}}
⊆ {∅, {a}, {b}, {a, b}, {a, c}, {a, b, c}} ⊆ {∅, {a}, {b}, {c}, {a, b}, {b, c}, {a, c}, {a, b, c}}

{∅, {a, b, c}} ⊆ {∅, {a}, {a, b, c}} ⊆ {∅, {a}, {b, c}, {a, b, c}} ⊆ {∅, {a}, {b}, {a, b}, {b, c}, {a, b, c}}
⊆ {∅, {a}, {b}, {c}, {a, b}, {b, c}, {a, c}, {a, b, c}}

{∅, {a, b, c}} ⊆ {∅, {b, c}, {a, b, c}} ⊆ {∅, {c}, {b, c}, {a, b, c}} ⊆ {∅, {b}, {c}, {b, c}, {a, b, c}}
⊆ {∅, {b}, {c}, {a, b}, {b, c}, {a, b, c}} ⊆ {∅, {a}, {b}, {c}, {a, b}, {b, c}, {a, c}, {a, b, c}}

Below is a picture of all 29 topologies on {a, b, c} (to avoid clutter we left out the names of the elements). Again, a large circle surrounds a, b, and c in all cases because S = {a, b, c} is in all 29 topologies. Also, it is understood that the empty set is in all these topologies.

I organized these topologies by the number of sets in each topology. The lowest row consists of just the trivial topology. The next row up consists of the topologies with just one additional set (three sets in total because ∅ and S are in every topology), and so on.

[Figure: all 29 topologies on {a, b, c}, arranged in rows by the number of sets in each topology.]

Below we see a visual representation of the three chains described above. As each path moves from the bottom to the top of the picture, we move from coarser to finer topologies.
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4. Let S = R and let T = {X CR| Vx € X Ja,b E R(x € (a,b) A (a,b) S X}. In other words, 
we are defining a subset of R to be open as we did in Lesson 6. That is, a subset X of R is open 
if for every real number x E X, there is an open interval (a, b) with x E (a,b) and (a,b) E X. 
By Theorem 6.4, @,R E T. By Theorem 6.7, T is closed under taking arbitrary unions. By 
Problem 6 from Problem Set 6 (part (iii)), T is closed under taking finite intersections. It follows 
that J is a topology on R. This topology is called the standard topology on R. 

5. Let S =C and let T = {X CC | Yz € X Ja E€ Car € R*(z € N,(a) AN,(a) S X}. In other 
words, we are defining a subset of C to be open as we did in Lesson 7. That is, a subset X of C 
is open if for every complex number z € X, there is an open disk (or neighborhood) N, (a) with 
z € N,(a) and N,(a) © X. By Example 7.8 (part 4), Ø, C ET. By Problem 8 in Problem Set 7 
(parts (i) and (ii)), T is closed under taking arbitrary unions and finite intersections. It follows 
that J is a topology on C. This topology is called the standard topology on C. 


Note: Recall that for a ∈ C and r ∈ R⁺, the r-neighborhood of a, written N_r(a), is the open disk with
center a and radius r. That is, N_r(a) = {z ∈ C | |z − a| < r}. See Lesson 7 for details.


Bases 


If (S, T) is a topological space, then a basis for the topology T is a subset B ⊆ T such that every
element of T can be written as a union of elements from B. We say that T is generated by B or B
generates T.


Notes: (1) Given a topological space (S, T), it can be cumbersome to describe all the open sets in T.
However, it is usually not too difficult to describe a topology in terms of its basis elements.


(2) If (S, T) is a topological space and B is a basis for T, then T = {⋃X | X ⊆ B}.


(3) More generally, if X is any collection of subsets of S, then we can say that X generates
{⋃Y | Y ⊆ X}. However, this set will not always be a topology on S.


Example 14.2: 


1. Let S = {a, b, c} and T = {∅, {a}, {b}, {a, b}, {b, c}, {a, b, c}}. The set B = {{a}, {b}, {b, c}} is a
basis for T. Indeed, we have {a} = ⋃{{a}}, {b} = ⋃{{b}}, {a, b} = ⋃{{a}, {b}} = {a} ∪ {b},
{b, c} = ⋃{{b, c}}, {a, b, c} = ⋃{{a}, {b, c}} = {a} ∪ {b, c}, and ∅ = ⋃∅.

(Note that ⋃∅ = {y | there is Y ∈ ∅ with y ∈ Y} = ∅. It follows that ∅ does not need to be
included in a basis.)


We can visualize the basis B and the topology T that is generated by B as follows.


[Figure: the basis B (left) and the topology T generated by B (right)]


We know ∅ ∈ T (even though ∅ is not indicated in the picture of T) because T is a topology.
On the other hand, it is unclear from the picture of B whether ∅ ∈ B. However, it doesn’t really
matter. Since ∅ is equal to an empty union, ∅ will always be generated from B anyway.


There can be more than one basis for the same topology. Here are a few more bases for the
topology T just discussed (are there any others?):


B₁ = {{a}, {b}, {a, b}, {b, c}}          B₂ = {{a}, {b}, {a, b}, {b, c}, {a, b, c}}
B₃ = {∅, {a}, {b}, {b, c}}               B₄ = T = {∅, {a}, {b}, {a, b}, {b, c}, {a, b, c}}


2. Let S = {a, b, c} and let X = {{a}, {b}}. In this case, X generates {∅, {a}, {b}, {a, b}}. This set is
not a topology on S because S = {a, b, c} is not in the set. The reason that X failed to generate
a topology on S is that it didn’t completely “cover” S. Specifically, c is not in any set in X.


In general, if an element x from a set S does not appear in any of the sets in a set X, then no 
matter how large a union we take from X, we will never be able to generate a set from X with 
x in it, and therefore, X will not generate a topology on S (although it might generate a 
topology on a subset of S). 


3. Let S = {a, b, c} and X = {{a, b}, {b, c}}. In this case, X generates {∅, {a, b}, {b, c}, {a, b, c}}.
This set is also not a topology because {a, b} ∩ {b, c} = {b} is not in the set. In other words, the
set is not closed under finite intersections.


In general, if there are two sets A and B in X with nonempty intersection such that no nonempty
set in X is contained in A ∩ B, then the set generated by X will
not be closed under finite intersections, and therefore, X will not generate a topology on S.
Note that A ∩ B itself does not necessarily need to be in X. However, there does need to be a
nonempty set C with C ⊆ A ∩ B and C ∈ X.
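

The three parts of Example 14.2 can be replayed on a computer. The following Python sketch is our own illustration (the helper names fs, generated, and is_topology are not from the lesson). It forms every union of a subcollection of X and then tests whether the resulting set is a topology on S.

```python
from itertools import combinations


def fs(*elements):
    """Shorthand for building a frozenset."""
    return frozenset(elements)


def generated(X):
    """Return the set of all unions of subcollections of X."""
    X = list(X)
    unions = set()
    for r in range(len(X) + 1):
        for Y in combinations(X, r):
            unions.add(frozenset().union(*Y))  # the empty union is the empty set
    return unions


def is_topology(S, T):
    """Check the topology axioms for a finite collection T of subsets of S."""
    if frozenset() not in T or S not in T:
        return False
    return all((A | B) in T and (A & B) in T for A in T for B in T)


S = fs("a", "b", "c")
B = [fs("a"), fs("b"), fs("b", "c")]   # part 1: a basis
X2 = [fs("a"), fs("b")]                # part 2: does not cover S
X3 = [fs("a", "b"), fs("b", "c")]      # part 3: {b} cannot be generated

for name, X in [("B", B), ("X2", X2), ("X3", X3)]:
    print(name, is_topology(S, generated(X)))
# B True
# X2 False
# X3 False
```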


Parts 2 and 3 from Example 14.2 show us that not every collection X of subsets of a set S is the basis
for a topology on S. Let’s see if we can find conditions on a collection X of subsets of S that will
guarantee that X is a basis for a topology on S.


We say that X covers S if every element of S belongs to at least one member of X. Symbolically, we
have


∀x ∈ S ∃A ∈ X (x ∈ A).


We say that X has the intersection containment property on S if every element of S that is in the
intersection of two sets in X is also in some set in X that is contained in that intersection.
Symbolically, we have


∀x ∈ S ∀A, B ∈ X (x ∈ A ∩ B → ∃C ∈ X (x ∈ C ∧ C ⊆ A ∩ B)).


Example 14.3: 


1. Once again, let S = {a, b, c}. X₁ = {{b}, {b, c}} does not cover S because a ∈ S does not belong
to any member of X₁. X₁ does have the intersection containment property: the only element
of {b} ∩ {b, c} = {b} is b, and b ∈ {b} ∈ X₁ and {b} is contained in {b} ∩ {b, c}. Notice that the
set that X₁ generates is {∅, {b}, {b, c}}. This set is not a topology on S because S = {a, b, c} is
not in this set. However, it is a topology on {b, c}.


X₂ = {{a, b}, {b, c}} covers S, but does not have the intersection containment property.
Indeed, {a, b} ∩ {b, c} = {b} and b ∈ {b}, but {b} ∉ X₂. Notice that the set that X₂ generates
is {∅, {a, b}, {b, c}, {a, b, c}}. This set is not a topology on S because {a, b} ∩ {b, c} = {b} is not
in the set.


B = {{b}, {a, b}, {b, c}} covers S and has the intersection containment property. The set that B
generates is the topology T = {∅, {b}, {a, b}, {b, c}, {a, b, c}}.


We can visualize the sets X₁, X₂, B, and T as follows.


[Figure: the collections X₁, X₂, B, and the topology T on {a, b, c}]


2. Let S = R and let B = {(a, b) | a, b ∈ R ∧ a < b} be the set of open intervals with endpoints
in R. B covers R because if x ∈ R, then x ∈ (x − 1, x + 1) ∈ B. B also has the intersection
containment property. Indeed, if x ∈ R and (a, b), (c, d) ∈ B with x ∈ (a, b) ∩ (c, d), then
x ∈ (a, b) ∩ (c, d) = (e, f), where e = max{a, c} and f = min{b, d} (see part (ii) of Problem
6 from Problem Set 6) and (e, f) ∈ B. In fact, B is a basis on R that generates the standard
topology of R.


To see that B generates the standard topology on R, let T be the standard topology on R and
let T′ be the topology generated by B. First, let X ∈ T. By Theorem 6.8, X can be expressed as
a union of bounded open intervals. So, X ∈ T′. Since X ∈ T was arbitrary, T ⊆ T′. Now, let
X ∈ T′. Then X is a union of bounded open intervals, say X = ⋃Y. Let x ∈ X. Since X = ⋃Y,
x ∈ ⋃Y. So, x ∈ (a, b) for some (a, b) ∈ Y. Since (a, b) ∈ Y, (a, b) ⊆ ⋃Y = X. Therefore,
X ∈ T. Since X ∈ T′ was arbitrary, T′ ⊆ T. Since T ⊆ T′ and T′ ⊆ T, T′ = T.


3. Let S = R and let X = {(−∞, b) | b ∈ R} ∪ {(a, ∞) | a ∈ R}. X covers R because if x ∈ R,
then x ∈ (x − 1, ∞) ∈ X. However, X does not have the intersection containment property.
For example, 0 ∈ (−∞, 1) ∩ (−1, ∞) = (−1, 1), but there is no set in X contained in (−1, 1).
The set generated by X is X ∪ {(−∞, b) ∪ (a, ∞) | a, b ∈ R ∧ b ≤ a} ∪ {∅, R}. This set is not
closed under finite intersections, and therefore, it is not a topology on R.
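

For finite examples such as those in part 1 above, the two properties can be tested mechanically. Here is a small Python sketch, offered only as an illustration. The function names covers and has_icp are ours, and has_icp simply spells out the symbolic definition of the intersection containment property.

```python
def fs(*elements):
    """Shorthand for building a frozenset."""
    return frozenset(elements)


def covers(S, X):
    """Every element of S belongs to at least one member of X."""
    return all(any(x in A for A in X) for x in S)


def has_icp(S, X):
    """Intersection containment property of X on S."""
    for x in S:
        for A in X:
            for B in X:
                if x in A and x in B:
                    # We need some C in X with x in C and C a subset of A ∩ B.
                    if not any(x in C and C <= (A & B) for C in X):
                        return False
    return True


S = fs("a", "b", "c")
X1 = [fs("b"), fs("b", "c")]
X2 = [fs("a", "b"), fs("b", "c")]
B = [fs("b"), fs("a", "b"), fs("b", "c")]

for name, X in [("X1", X1), ("X2", X2), ("B", B)]:
    print(name, covers(S, X), has_icp(S, X))
# X1 False True
# X2 True False
# B True True
```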


Based on the previous examples, the next theorem should come as no surprise. 


Theorem 14.1: Let S be a nonempty set and let ℬ be a collection of subsets of S. ℬ is a basis for a
topology on S if and only if ℬ covers S and ℬ has the intersection containment property on S.


Note: The set generated by ℬ is {⋃X | X ⊆ ℬ}. This set can also be written in the alternative form
{A ⊆ S | ∀x ∈ A ∃B ∈ ℬ (x ∈ B ∧ B ⊆ A)}. You will be asked to verify that these two sets are equal in
Problem 6 below. We will use this alternative form of the set generated by ℬ in the proof of Theorem
14.1.


Proof of Theorem 14.1: Suppose that ℬ covers S and ℬ has the intersection containment property on
S. The set generated by ℬ is T = {A ⊆ S | ∀x ∈ A ∃B ∈ ℬ (x ∈ B ∧ B ⊆ A)}. Let’s check that T is a
topology on S.


Since A = ∅ vacuously satisfies the condition ∀x ∈ A ∃B ∈ ℬ (x ∈ B ∧ B ⊆ A), we have ∅ ∈ T.
To see that S ∈ T, let x ∈ S. Since ℬ covers S, there is B ∈ ℬ such that x ∈ B and B ⊆ S. So, S ∈ T.


Let X ⊆ T and let x ∈ ⋃X. Then there is A ∈ X with x ∈ A. Since X ⊆ T, A ∈ T. So, there is B ∈ ℬ
such that x ∈ B and B ⊆ A. Since B ⊆ A and A ⊆ ⋃X, B ⊆ ⋃X. It follows that the condition
∀x ∈ ⋃X ∃B ∈ ℬ (x ∈ B ∧ B ⊆ ⋃X) is satisfied. So, ⋃X ∈ T.


We now prove by induction on n ∈ N that for n ≥ 2, the intersection of n sets in T is also in T.


Base Case (n = 2): Let A₁, A₂ ∈ T and let x ∈ A₁ ∩ A₂. Then there are B₁, B₂ ∈ ℬ with x ∈ B₁, x ∈ B₂,
B₁ ⊆ A₁, and B₂ ⊆ A₂. Since x ∈ B₁ and x ∈ B₂, x ∈ B₁ ∩ B₂. Since ℬ has the intersection containment
property, there is C ∈ ℬ such that x ∈ C and C ⊆ B₁ ∩ B₂. Since B₁ ⊆ A₁ and B₂ ⊆ A₂, C ⊆ A₁ ∩ A₂.
Therefore, A₁ ∩ A₂ ∈ T.


Inductive Step: Suppose that the intersection of k sets in T is always in T. Let A₁, A₂, …, A_k, A_{k+1} ∈ T.
By the inductive hypothesis, A₁ ∩ A₂ ∩ ⋯ ∩ A_k ∈ T. If we let C = A₁ ∩ A₂ ∩ ⋯ ∩ A_k and D = A_{k+1},
then we have C, D ∈ T. By the base case, C ∩ D ∈ T. It follows that


A₁ ∩ A₂ ∩ ⋯ ∩ A_k ∩ A_{k+1} = (A₁ ∩ A₂ ∩ ⋯ ∩ A_k) ∩ A_{k+1} = C ∩ D ∈ T.


Since ∅, S ∈ T, T is closed under arbitrary unions, and T is closed under finite intersections, it follows
that T is a topology. By the note following the statement of Theorem 14.1, ℬ generates T.


Conversely, suppose that ℬ is a basis for a topology T on S. Since T is a topology on S, S ∈ T. Since ℬ
is a basis for T, S = ⋃X for some X ⊆ ℬ. Let x ∈ S. Then x ∈ ⋃X. So, there is A ∈ X with x ∈ A.
Since X ⊆ ℬ, A ∈ ℬ. Since x ∈ S was arbitrary, ℬ covers S.


Let x ∈ A₁ ∩ A₂, where A₁, A₂ ∈ ℬ. Then A₁, A₂ ∈ T, and since T is a topology on S, A₁ ∩ A₂ ∈ T.
Since ℬ is a basis for T, A₁ ∩ A₂ = ⋃X for some X ⊆ ℬ. It follows that x ∈ ⋃X. So, there is C ∈ X
with x ∈ C. Since X ⊆ ℬ, C ∈ ℬ. Also, C ⊆ ⋃X = A₁ ∩ A₂. Since A₁, A₂ ∈ ℬ and x ∈ S were arbitrary,
ℬ has the intersection containment property. □
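

A proof settles the matter, but it can still be reassuring to watch Theorem 14.1 hold in a small case. The Python sketch below is our own sanity check, not part of the proof, and all function names are chosen for illustration. It runs through all 256 collections of subsets of S = {a, b, c} and compares "the generated set is a topology on S" with "the collection covers S and has the intersection containment property."

```python
from itertools import combinations


def powerset(s):
    """Return every subset of s as a frozenset."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def generated(X):
    """Return the set of all unions of subcollections of X."""
    unions = set()
    for r in range(len(X) + 1):
        for Y in combinations(X, r):
            unions.add(frozenset().union(*Y))
    return unions


def is_topology(S, T):
    """Check the topology axioms for a finite collection T of subsets of S."""
    if frozenset() not in T or S not in T:
        return False
    return all((A | B) in T and (A & B) in T for A in T for B in T)


def covers(S, X):
    """Every element of S belongs to at least one member of X."""
    return all(any(x in A for A in X) for x in S)


def has_icp(S, X):
    """Intersection containment property of X on S."""
    return all(any(x in C and C <= (A & B) for C in X)
               for x in S for A in X for B in X if x in A and x in B)


S = frozenset({"a", "b", "c"})
subsets = powerset(S)  # the 8 subsets of S

mismatches = 0
for r in range(len(subsets) + 1):
    for X in combinations(subsets, r):
        generates_topology = is_topology(S, generated(X))
        conditions_hold = covers(S, X) and has_icp(S, X)
        if generates_topology != conditions_hold:
            mismatches += 1

print(mismatches)  # 0
```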


Example 14.4: 


1. If S is any set, then B = {S} is a basis for the trivial topology on S. Note that {S} covers S and
{S} has the intersection containment property on S (there is just one instance to check:
S ∩ S = S and S ∈ {S}).


2. If S is any set, then B = {{x} | x ∈ S} is a basis for the discrete topology on S. B covers S
because if x ∈ S, then {x} ∈ B and x ∈ {x}. B has the intersection containment
property because distinct members of B are disjoint and {x} ∩ {x} = {x} ∈ B.


3. Let S = R and let B = {(a, b) | a, b ∈ R ∧ a < b}. We saw in Example 14.3 (part 2) that B
covers R and that B has the intersection containment property on R. It follows that B is a basis
for a topology on R. In fact, we already saw in the same Example that B generates the standard
topology on R.


The basis B just described is uncountable because R is uncountable and the function f: R → B
defined by f(r) = (r, r + 1) is injective. Does R with the standard topology have a countable
basis? In fact, it does! Let B′ = {(a, b) | a, b ∈ Q ∧ a < b}. In Problem 9 below you will be asked
to show that B′ is countable and that B′ is a basis for R with the standard topology.


4. We saw in part 3 of Example 14.3 that X = {(−∞, b) | b ∈ R} ∪ {(a, ∞) | a ∈ R} does not have
the intersection containment property. It follows from Theorem 14.1 that X is not a basis for a
topology on R.


However, B* = {(a, ∞) | a ∈ R} covers R and has the intersection containment property.
Therefore, B* is a basis for a topology T* on R. Since every set of the form (a, ∞) is open in the
standard topology, T* is coarser than the standard topology on R. Since no bounded open
interval is in T*, we see that T* is strictly coarser than the standard topology on R.


Note: Although the set X = {(−∞, b) | b ∈ R} ∪ {(a, ∞) | a ∈ R} is not a basis for a topology on R, if
we let B be the collection of all finite intersections of sets in X, then B does form a basis for a topology
on R (because X covers R). In this case, we call X a subbasis for the topology generated by B. Since every
bounded open interval is in B, it is not hard to see that B generates the standard topology on R. We can
also say that the standard topology on R is generated by the subbasis X.


Types of Topological Spaces


A topological space (S, T) is a T₀-space (or Kolmogorov space) if for all x, y ∈ S with x ≠ y, there is
U ∈ T such that either x ∈ U and y ∉ U or x ∉ U and y ∈ U.


In other words, in a T₀-space, given any two elements, there is an open set that
contains one of the elements and excludes the other. In the picture on the right we
see two typical elements a and b in a T₀-space. We have drawn an open set
containing a and excluding b. There does not need to be an open set containing b
and excluding a (although there can be).


Example 14.5: 


1. Let S = {a, b} where a ≠ b. S together with the trivial topology {∅, {a, b}} is not a T₀-space. In
fact, the trivial topology on any set with more than one element is not a T₀-space.


{a, b} together with the discrete topology {∅, {a}, {b}, {a, b}} is a T₀-space because the open
set {a} satisfies a ∈ {a} and b ∉ {a}. In fact, the discrete topology on any set is a T₀-space.


The other two topologies on {a, b} are also T₀-spaces. For example, {∅, {a}, {a, b}} is a T₀-space
because {a} is open, a ∈ {a} and b ∉ {a}.


2. Let S = R and let T be the topology generated by the basis {(a, ∞) | a ∈ R}. Then (S, T) is a
T₀-space. If a, b ∈ R with a < b, then U = (a, ∞) is an open set with b ∈ U and a ∉ U.


[Figure: the real line with a < b and the open set U = (a, ∞), which contains b but not a]


3. If (S, T) is a T₀-space and T′ is finer than T, then (S, T′) is also a T₀-space. Indeed, if U ∈ T
with x ∈ U and y ∉ U, then since T′ is finer than T, we have U ∈ T′.


For example, since the standard topology on R is finer than the topology generated by
{(a, ∞) | a ∈ R}, R together with the standard topology on R is a T₀-space.


A topological space (S, T) is a T₁-space (or Fréchet space or Tikhonov space) if for all x, y ∈ S with
x ≠ y, there are U, V ∈ T such that x ∈ U and y ∉ U and x ∉ V and y ∈ V.


In the picture on the right we see two typical elements a and b in a T₁-space. We
have drawn an open set containing a and excluding b and an open set containing b
and excluding a. These two open sets do not need to be disjoint. The smaller dots in
the picture are representing some elements of the space other than a and b.


Example 14.6: 


1. S = {a, b} together with the discrete topology {∅, {a}, {b}, {a, b}} is a T₁-space because the
open sets {a} and {b} satisfy a ∈ {a}, b ∉ {a}, b ∈ {b}, and a ∉ {b}. In fact, the discrete
topology on any set is a T₁-space.


It should be clear from the definitions that every T₁-space is a T₀-space. It follows that the trivial
topology on any set with more than one element is not a T₁-space.


The other two topologies on {a, b} are not T₁-spaces. For example,
{∅, {a}, {a, b}} is not a T₁-space because the only open set containing b also
contains a.


In fact, the only topology on any finite set that is T₁ is the discrete topology.
To see this, let T be a topology on a finite set X that is T₁ and let x ∈ X. For
each y ∈ X with y ≠ x, there is an open set U_y such that x ∈ U_y and y ∉ U_y. It follows that
U = ⋂{U_y | y ∈ X ∧ y ≠ x} is open (it is a finite intersection of open sets) and it is easy to
see that U = {x}. So, T is generated by the
one point sets, and therefore, T is the discrete topology on X.


2. Let S = R and let T be the topology generated by the basis {(a, ∞) | a ∈ R}. Then (S, T) is not
a T₁-space. To see this, let x, y ∈ R with x < y. Let U be an open set containing x. If U = R,
then y ∈ U, so suppose U = (a, ∞) for some a ∈ R. Since x < y and a < x, we have a < y, and
so, y ∈ U. Therefore, there is no open set U with x ∈ U and y ∉ U.


It’s worth noting that the topology generated by {(a, ∞) | a ∈ R} is {(a, ∞) | a ∈ R} ∪ {∅, R}.


3. Let S = R and let T be the topology generated by the basis B = {X ⊆ R | R \ X is finite}. T is
called the cofinite topology on R. I leave it to the reader to verify that B generates a topology
on R that is strictly coarser than the standard topology (Problem 3 below). It’s easy to see that
(S, T) is a T₁-space. Indeed, if a, b ∈ R with a ≠ b, then let U = R \ {b} and V = R \ {a}.
Then U and V are open, a ∈ U, b ∉ U, a ∉ V, and b ∈ V.


4. If (S, T) is a T₁-space and T′ is finer than T, then (S, T′) is also a T₁-space. Indeed, if U, V ∈ T
with x ∈ U and y ∉ U and x ∉ V and y ∈ V, then since T′ is finer than T, we have U, V ∈ T′.


For example, since the standard topology on R is finer than the cofinite topology on R, it follows
that R together with the standard topology on R is a T₁-space.
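

On a finite set, the definitions of T₀-space and T₁-space can be checked directly. The following Python sketch is an illustration of ours (the names is_T0 and is_T1 are not from the lesson). It classifies the four topologies on {a, b} discussed in Examples 14.5 and 14.6.

```python
def fs(*elements):
    """Shorthand for building a frozenset."""
    return frozenset(elements)


def is_T0(S, T):
    """For distinct x and y, some open set contains exactly one of them."""
    return all(any((x in U) != (y in U) for U in T)
               for x in S for y in S if x != y)


def is_T1(S, T):
    """For each ordered pair of distinct points (x, y), some open set
    contains x but not y. Ranging over ordered pairs covers both U and V."""
    return all(any(x in U and y not in U for U in T)
               for x in S for y in S if x != y)


S = fs("a", "b")
topologies = {
    "trivial": [fs(), S],
    "with {a}": [fs(), fs("a"), S],
    "with {b}": [fs(), fs("b"), S],
    "discrete": [fs(), fs("a"), fs("b"), S],
}

for name, T in topologies.items():
    print(name, is_T0(S, T), is_T1(S, T))
# trivial False False
# with {a} True False
# with {b} True False
# discrete True True
```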


Theorem 14.2: A topological space (S, T) is a T₁-space if and only if for all x ∈ S, {x} is a closed set.


Proof: Let (S, T) be a topological space. First, assume that (S, T) is a T₁-space and let x ∈ S. For each
y ∈ S with y ≠ x, there is an open set U_y with y ∈ U_y and x ∉ U_y. Then U = ⋃{U_y | y ∈ S ∧ y ≠ x}
is open (because U is a union of open sets). Let’s check that {x} = S \ U. Since x ∉ U_y for all y ≠ x,
x ∉ U. So, x ∈ S \ U. It follows that {x} ⊆ S \ U. If z ∈ S \ U, then z ∉ U. So, for all y ≠ x, z ∉ U_y.
Thus, for all y ≠ x, z ≠ y (because y ∈ U_y). Therefore, z = x, and so, z ∈ {x}. So, S \ U ⊆ {x}.
Since {x} ⊆ S \ U and S \ U ⊆ {x}, we have {x} = S \ U. Since U is open, {x} = S \ U is closed.


Conversely, suppose that for all x ∈ S, {x} is a closed set. Let x, y ∈ S with x ≠ y, let U = S \ {y}, and
let V = S \ {x}. Then U and V are open sets such that x ∈ U and y ∉ U and x ∉ V and y ∈ V. So, (S, T)
is a T₁-space. □


A topological space (S, T) is a T₂-space (or Hausdorff space) if for all x, y ∈ S with x ≠ y, there are
U, V ∈ T with x ∈ U, y ∈ V, and U ∩ V = ∅.


In the picture on the right we see two typical elements a and b in a T₂-space. We
have drawn disjoint open sets, one including a and the other including b. The smaller
dots in the picture represent some elements of the space other than a and b.


Example 14.7:


1. The discrete topology on any set is a T₂-space. Indeed, if a and b are distinct points from a
discrete space, then {a} and {b} are disjoint open sets.


It should be clear from the definitions that every T₂-space is a T₁-space. It follows that except
for the discrete topology, every other topology on a finite set is not a T₂-space.


2. The topological space (R, T), where T is the cofinite topology on R (see part 3 of Example 14.6)
is not a T₂-space. Indeed, if U and V are open sets containing a, b ∈ R, respectively, then
R \ (U ∩ V) = (R \ U) ∪ (R \ V) (this is De Morgan’s law), which is finite. So, U ∩ V is
infinite, and therefore, nonempty.


3. The standard topologies on R and C are both T₂. The same argument can be used for both
(although the geometry looks very different).


Let S = R or C and let x, y ∈ S with x ≠ y. Let ε = (1/2)d(x, y) = (1/2)|x − y|. Then U = N_ε(x)
and V = N_ε(y) are disjoint open sets with x ∈ U and y ∈ V.


In the picture below, we have drawn two typical real numbers x and y on the real line and then
separated them with the disjoint neighborhoods U = N_ε(x) and V = N_ε(y).


[Figure: the real line with the disjoint open intervals U = N_ε(x) and V = N_ε(y)]


In the picture to the right, we have drawn two typical complex numbers x and y in the complex
plane and then separated them with the disjoint neighborhoods U = N_ε(x) and V = N_ε(y).


Note once again that neighborhoods on the real line are open intervals, whereas neighborhoods in
the complex plane are open disks.


4. If (S, T) is a T₂-space and T′ is finer than T, then (S, T′) is also a T₂-space. Indeed, if
U, V ∈ T with x ∈ U, y ∈ V, and U ∩ V = ∅, then since T′ is finer than T, we have U, V ∈ T′.
Let’s look at an example of this.


Let K = {1/n | n ∈ Z⁺} and B = {(a, b) | a, b ∈ R ∧ a < b} ∪ {(a, b) \ K | a, b ∈ R ∧ a < b}. In
Problem 4 below, the reader will be asked to verify that B is a basis for a topology T_K on R.
Since T_K contains every basis element of the standard topology on R, we see that T_K is finer
than the standard topology. It follows that (R, T_K) is a T₂-space.
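

The separation argument from part 3 of Example 14.7 is easy to try out numerically. The Python sketch below is only an illustration with names of our own choosing. It computes ε = (1/2)|x − y| for a pair of real numbers and a pair of complex numbers, and then confirms on a few sample points that no sample lies in both N_ε(x) and N_ε(y). Checking samples is, of course, not a proof.

```python
def separating_radius(x, y):
    """Return epsilon = |x - y| / 2 for distinct real or complex x and y."""
    if x == y:
        raise ValueError("x and y must be distinct")
    return abs(x - y) / 2


def in_neighborhood(z, center, radius):
    """Is z in the open neighborhood of the given center and radius?"""
    return abs(z - center) < radius


# Real line: x = 1 and y = 4 give epsilon = 1.5, so U = (-0.5, 2.5), V = (2.5, 5.5).
x, y = 1, 4
eps = separating_radius(x, y)
print(eps)  # 1.5
real_samples = [n / 2 for n in range(-4, 14)]  # -2.0, -1.5, ..., 6.5
overlap_real = [z for z in real_samples
                if in_neighborhood(z, x, eps) and in_neighborhood(z, y, eps)]
print(overlap_real)  # []

# Complex plane: x = 0 and y = 3 + 4i give epsilon = 2.5.
p, q = 0j, 3 + 4j
eps_c = separating_radius(p, q)
print(eps_c)  # 2.5
complex_samples = [complex(m, n) for m in range(-3, 7) for n in range(-3, 8)]
overlap_complex = [z for z in complex_samples
                   if in_neighborhood(z, p, eps_c) and in_neighborhood(z, q, eps_c)]
print(overlap_complex)  # []
```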


A topological space (S, T) is a T₃-space (or Regular space) if (S, T) is a T₁-space and for every x ∈ S
and closed set X with X ⊆ S \ {x}, there are U, V ∈ T with x ∈ U, X ⊆ V, and U ∩ V = ∅.


In the picture on the right we see a typical element a and a closed set K in a 
T3-space. We have drawn disjoint open sets, one including a and the other 
containing K. The smaller dots in the picture represent some elements of the 
space other than a that are not included in K. (Note that we replaced an arbitrary 
closed set X with the specific closed set K, and similarly, we replaced x with a.) 


Example 14.8: 


1. The discrete topology on any set S is a T₃-space. Indeed, if x ∈ S and A is any subset of S \ {x}
(all subsets of S are closed), simply let U = {x} and V = A (all subsets of S are also open).


Some authors call a set clopen if it is both open and closed. If S is given the discrete topology, 
then all subsets of S are clopen. 


2. Every T₃-space is a T₂-space. This follows easily from the fact that a T₃-space is a T₁-space and
Theorem 14.2. It follows that except for the discrete topology, every other topology on a finite
set is not a T₃-space.


3. The standard topologies on R and C are both T₃. This follows from Problem 14 below.


4. Consider the T₂-space (R, T_K) from part 4 of Example 14.7. Recall that K = {1/n | n ∈ Z⁺} and
(R, T_K) has basis B = {(a, b) | a, b ∈ R ∧ a < b} ∪ {(a, b) \ K | a, b ∈ R ∧ a < b}. Let x = 0
and A = K. R \ K = (−∞, 0) ∪ [(−1, 1) \ K] ∪ (1, ∞), which is a union of three open sets, thus
open. Therefore, K is a closed set in this topology. Let U be an open set containing 0 and let V
be an open set containing K. For some ε > 0, (0, ε) \ K ⊆ U. By the Archimedean Property of
R, there is n ∈ N with n > 1/ε, or equivalently, 1/n < ε. There is 0 < δ < ε − 1/n such that
(1/n − δ, 1/n + δ) ⊆ V. Let r be an irrational number in (1/n, 1/n + δ). Then r ∈ U ∩ V and
therefore, U ∩ V ≠ ∅. Since we cannot separate 0 and K with open sets, (R, T_K) is not a T₃-space.


Unlike T₀, T₁, and T₂-spaces, T₃-spaces are not closed under upward refinement. In other
words, if (S, T) is a T₃-space and T′ is finer than T, then (S, T′) is not necessarily a T₃-space.
The topological space (R, T_K) proves this.


Also, since (R, T) is T₃, where T is the standard topology on R, but (R, T_K) is not, the two
topological spaces cannot be the same. It follows that T_K is strictly finer than the standard
topology on R.


A topological space (S, T) is a T₄-space (or Normal space) if (S, T) is a T₁-space
and for every pair X, Y of disjoint closed subsets of S, there are U, V ∈ T with
X ⊆ U, Y ⊆ V, and U ∩ V = ∅.


In the picture on the right we see two closed sets K and L in a T₄-space. We have
drawn disjoint open sets, one containing K and the other containing L. The
smaller dots in the picture represent some elements of the space not included in
K or L. (Note that we replaced the arbitrary closed sets X and Y with specific closed sets K and L.)


Example 14.9:


1. The discrete topology on any set S is a T₄-space. Indeed, if A and B are disjoint closed subsets
of S, then A and B are also disjoint open subsets of S (because all subsets of S are both open
and closed).


Every T₄-space is a T₃-space. This follows easily from the fact that a T₄-space is a T₁-space and
Theorem 14.2. It follows that except for the discrete topology, every other topology on a finite
set is not a T₄-space.


2. The standard topologies on R and C are both T₄. This follows immediately from Problem 14
below.


3. In Problem 15 below, you will see a T₃-space that is not a T₄-space.


The definitions of T₀, T₁, T₂, T₃, and T₄ are called separation axioms because they all involve
“separating” points and/or closed sets from each other by open sets.


We will now look at two more types of topological spaces that appear frequently in mathematics.
A metric space is a pair (S, d), where S is a set and d is a function d: S × S → R with the following
properties:

1. For all x, y ∈ S, d(x, y) = 0 if and only if x = y.

2. For all x, y ∈ S, d(x, y) = d(y, x).

3. For all x, y, z ∈ S, d(x, z) ≤ d(x, y) + d(y, z).


The function d is called a metric or distance function. It is a consequence of the definition that for all
x, y ∈ S, d(x, y) ≥ 0. You will be asked to prove this in Problem 2 below.


If (S, d) is a metric space, a ∈ S, and r ∈ R⁺, then the open ball centered at a with radius r, written
B_r(a) (or B_r(a; d) if we need to distinguish this metric from other metrics), is the set of all elements
of S whose distance to a is less than r. That is,


B_r(a) = {x ∈ S | d(a, x) < r}.


The collection B = {B_r(a) | a ∈ S ∧ r ∈ R⁺} covers S. Indeed, if a ∈ S, then d(a, a) = 0 < 1, and so,
a ∈ B₁(a).


Also, the collection B = {B_r(a) | a ∈ S ∧ r ∈ R⁺} has the intersection containment property. To see
this, let x ∈ B_r(a) ∩ B_s(b) and k = min{r − d(a, x), s − d(b, x)}. Note that k > 0 because
d(a, x) < r and d(b, x) < s. We have x ∈ B_k(x) because d(x, x) = 0 < k. Now, let
y ∈ B_k(x). Then d(x, y) < k. So, we have


d(a, y) ≤ d(a, x) + d(x, y) < d(a, x) + k ≤ d(a, x) + r − d(a, x) = r.


So, y ∈ B_r(a). A similar argument shows that y ∈ B_s(b). So,
y ∈ B_r(a) ∩ B_s(b). It follows that B_k(x) ⊆ B_r(a) ∩ B_s(b).


[Figure: intersecting open balls B_r(a) and B_s(b) with the open ball B_k(x) inside the intersection, drawn for the case k = r − d(a, x)]


This verifies that B has the intersection containment property.


Since the collection of open balls covers S and has the intersection containment property, it follows 
that this collection is a basis for a topology on S. 
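

To see the choice k = min{r − d(a, x), s − d(b, x)} in action, here is a short Python sketch. It is only an illustration under assumptions of our own: we take the metric space to be C with d(z, w) = |z − w|, we pick specific centers and radii, and the function name inner_radius is ours. The sketch samples grid points in B_k(x) and confirms that each one lies in both larger balls.

```python
def d(z, w):
    """The usual distance on C (chosen here just for the example)."""
    return abs(z - w)


def inner_radius(a, r, b, s, x):
    """Return k = min{r - d(a, x), s - d(b, x)} for x in both open balls."""
    if not (d(a, x) < r and d(b, x) < s):
        raise ValueError("x must lie in both open balls")
    return min(r - d(a, x), s - d(b, x))


a, r = 0 + 0j, 2.0      # the ball B_r(a)
b, s = 2 + 0j, 2.0      # the ball B_s(b)
x = 1 + 0.5j            # a point in the intersection
k = inner_radius(a, r, b, s, x)
print(k > 0)  # True

# Sample points on a grid and keep those that lie in B_k(x).
samples = [complex(i / 10, j / 10) for i in range(-10, 31) for j in range(-15, 26)]
inside_small_ball = [y for y in samples if d(x, y) < k]

# Every sampled point of B_k(x) lies in both B_r(a) and B_s(b).
print(all(d(a, y) < r and d(b, y) < s for y in inside_small_ball))  # True
```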


Note: Open balls can be visualized as open intervals on the real line R, open disks in the Complex Plane
C (or R²), or open balls in three-dimensional space R³.


When proving theorems about metric spaces, it’s usually most useful to visualize open balls as open 
disks in C. This does not mean that all metric spaces look like C. The visualization should be used as 
evidence that a theorem might be true. Of course, a detailed proof still needs to be written. 


This is exactly what we did when we drew the picture above. That picture represents the open balls
B_r(a) and B_s(b) as intersecting open disks. Inside this intersection, we can see the open ball B_k(x).
The reader may also want to draw another picture to help visualize the triangle inequality. A picture
similar to this is drawn to the right of Note 1 following the proof of Theorem 7.4 in Lesson 7.


A topological space (S, T) is metrizable if there is a metric d: S × S → R such that T is generated from
the open balls in (S, d). We also say that the metric d induces the topology T.


Example 14.10: 


1. (C, d) is a metric space, where d: C × C → R is defined by d(z, w) = |z − w|. Let’s check that
the 3 properties of a metric space are satisfied. Property 3 is the Triangle Inequality (Theorem
7.3 and Problem 4 in Problem Set 7). Let’s verify the other two properties. Let z = a + bi and
w = c + di. Then d(z, w) = |z − w| = √((a − c)² + (b − d)²). So, d(z, w) = 0 if and only if
√((a − c)² + (b − d)²) = 0 if and only if (a − c)² + (b − d)² = 0 if and only if a − c = 0 and
b − d = 0 if and only if a = c and b = d if and only if z = w. So, property 1 holds. We have
d(z, w) = |z − w| = |−(w − z)| = |−1(w − z)| = |−1||w − z| = 1|w − z| = d(w, z).
Therefore, property 2 holds.


If z ∈ C and r ∈ R⁺, then the open ball B_r(z) is the set B_r(z) = {w ∈ C | |z − w| < r}. This is
just an open disk in the complex plane, as we defined in Lesson 7.


Since the collection of open disks in the complex plane generates the standard topology on C, 
we see that C with the standard topology is a metrizable space. 


2. Similarly, (R, d) is a metric space, where d: R × R → R is defined by d(x, y) = |x − y|. The
proof is similar to the proof above for (C, d).


In this case, the open ball B_r(a) is the open interval (a − r, a + r). To see this, observe that we
have


B_r(a) = {x ∈ R | |x − a| < r} = {x ∈ R | −r < x − a < r}
= {x ∈ R | a − r < x < a + r} = (a − r, a + r).


Since the collection of bounded open intervals of real numbers generates the standard topology 
on R, we see that R with the standard topology is a metrizable space. 


3. Define the functions d₁ and d₂ from C × C to R by d₁(z, w) = |Re z − Re w| + |Im z − Im w|
and d₂(z, w) = max{|Re z − Re w|, |Im z − Im w|}. In Problem 7 below, you will be asked to
verify that (C, d₁) and (C, d₂) are metric spaces that induce the standard topology on C.


So, we see that the topology of a metrizable space can be induced by many different metrics.
The open balls B_r(a; d₁) and B_r(a; d₂) are both interiors of squares. For example, the unit open
ball in the metric d₁ is B₁(0; d₁) = {w ∈ C | d₁(0, w) < 1} = {w ∈ C | |Re w| + |Im w| < 1},
which is the interior of a square with vertices 1, i, −1, and −i. Similarly, the unit open ball in the
metric d₂ is B₁(0; d₂) = {w ∈ C | d₂(0, w) < 1} = {w ∈ C | max{|Re w|, |Im w|} < 1}, which
is the interior of a square with vertices 1 + i, −1 + i, −1 − i, and 1 − i.


[Figure: the unit open balls B₁(0; d₁) and B₁(0; d₂), each drawn as the interior of a square]


4. We can turn any nonempty set S into a metric space by defining d: S × S → R by


d(x, y) = 0 if x = y,
d(x, y) = 1 if x ≠ y.


Properties 1 and 2 are obvious. For Property 3, let x, y, z ∈ S. If x = z, then d(x, z) = 0, and
so, d(x, z) = 0 ≤ d(x, y) + d(y, z). If x ≠ z, then d(x, z) = 1. Also, y cannot be equal to both
x and z (otherwise y = x ∧ y = z → x = z). So, d(x, y) = 1 or d(y, z) = 1 (or both).
Therefore, d(x, y) + d(y, z) ≥ 1 = d(x, z).


If r > 1, then B_r(x) = S and if 0 < r ≤ 1, then B_r(x) = {x}. It follows that every singleton set
{x} is open and therefore, (S, d) induces the discrete topology on S.
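

The metrics from Example 14.10 are simple enough to test on a computer. The Python sketch below is an illustration of ours, not a substitute for the proofs requested in the problem set. The function names are our own, and the small tolerance tol is an assumption we add to absorb floating-point rounding in the triangle inequality. The sketch checks the three metric properties on a finite sample of points and tests a few points against the two square-shaped unit balls.

```python
def d_std(z, w):
    """The standard metric on C."""
    return abs(z - w)


def d1(z, w):
    return abs(z.real - w.real) + abs(z.imag - w.imag)


def d2(z, w):
    return max(abs(z.real - w.real), abs(z.imag - w.imag))


def d_discrete(x, y):
    return 0 if x == y else 1


def satisfies_metric_axioms(d, points, tol=1e-12):
    """Check properties 1, 2, and 3 on a finite sample of points."""
    for x in points:
        for y in points:
            if (d(x, y) == 0) != (x == y):   # property 1
                return False
            if d(x, y) != d(y, x):           # property 2
                return False
            for z in points:                 # property 3
                if d(x, z) > d(x, y) + d(y, z) + tol:
                    return False
    return True


points = [complex(p, q) for p in range(-2, 3) for q in range(-2, 3)]
for metric in (d_std, d1, d2, d_discrete):
    print(metric.__name__, satisfies_metric_axioms(metric, points))
# d_std True
# d1 True
# d2 True
# d_discrete True

origin = 0j
print(d1(origin, 1 + 0j) < 1)       # False: the vertex 1 is on the boundary
print(d1(origin, 0.5 + 0.4j) < 1)   # True
print(d1(origin, 0.9 + 0.9j) < 1)   # False
print(d2(origin, 0.9 + 0.9j) < 1)   # True: the d2 unit ball is the larger square
```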


Let (S, T) be a topological space. A collection C of subsets of S is a covering of S (or we can say that C
covers S) if ⋃C = S. If C consists of only open sets, then we will say that C is an open covering of S.


A topological space (S, T) is compact if every open covering of S contains a finite subcollection that 
covers S. 


Example 14.11:


1. If S is a finite set, then for any topology T on S, (S, T) is compact. After all, any open covering
of S is already finite.


2. If S is an infinite set and T is the discrete topology on S, then (S, T) is not compact. Indeed,
{{x} | x ∈ S} is an open covering of S with no finite subcollection covering S.


3. (R, T), where T is the standard topology on R, is not compact. Indeed, {(n, n + 2) | n ∈ Z} is
an open covering of R with no finite subcollection covering R.


4. The topological space (R, T), where T is the cofinite topology on R (see part 3 of Example 14.6)
is compact. To see this, let C be an open covering of R, and let A₀ be any nonempty set in C. Then
R \ A₀ is finite, say R \ A₀ = {a₁, a₂, …, a_n}. For each i = 1, 2, …, n, let A_i ∈ C with a_i ∈ A_i. Then the
collection {A₀, A₁, A₂, …, A_n} is a finite subcollection from C that covers R.


There is actually nothing special about R in this example. If S is any set, we can define the
cofinite topology on S to be the topology T generated from the basis {X ⊆ S | S \ X is finite}.
If we replace R by S in the argument above, we see that the topological space (S, T) is compact.
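

The argument in part 4 is really a procedure for extracting a finite subcover, and it can be sketched in Python. The representation used here is an assumption of ours: a nonempty open set A in the cofinite topology is stored as its finite complement, so a point lies in A exactly when it is not in the stored set. The function name finite_subcover is also ours.

```python
def finite_subcover(cover):
    """Extract a finite subcover following the argument in the text.

    cover is a list of frozensets. Each frozenset is the finite complement
    of a nonempty open set in the cofinite topology.
    """
    first = cover[0]                 # the complement of A_0
    chosen = [first]
    for point in sorted(first):      # the finitely many points that A_0 misses
        for complement in cover:
            if point not in complement:   # this open set contains the point
                chosen.append(complement)
                break
        else:
            raise ValueError("the given collection does not cover the space")
    return chosen


def covers_everything(complements):
    """The open sets cover the space when no point is missed by all of them."""
    return frozenset.intersection(*complements) == frozenset()


# A covering of the integers by four cofinite sets, given by their complements.
cover = [frozenset({1, 2, 3}), frozenset({2, 5}), frozenset({1, 7}), frozenset({4})]
chosen = finite_subcover(cover)
print(len(chosen) <= len(cover[0]) + 1)  # True: at most n + 1 sets are needed
print(covers_everything(chosen))         # True
```

The example cover is already finite, so the code only illustrates the selection step. The same selection works for an infinite covering because it inspects just the n points outside A₀.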


Continuous Functions and Homeomorphisms 


If f: X → Y and A ⊆ X, then the image of A under f is the set f[A] = {f(x) | x ∈ A}. Similarly, if
B ⊆ Y, then the inverse image of B under f is the set f⁻¹[B] = {x ∈ X | f(x) ∈ B}.


Let (X, T) and (Y, U) be topological spaces. A function f: X → Y is continuous if for each V ∈ U, we
have f⁻¹[V] ∈ T.


Notes: (1) In words, a function from one topological space to another is continuous if the inverse image 
of each open set is open. 


(2) Continuity of a function may depend just as much on the two given topologies as it does on the 
function f. 


(3) As an example of Note 2, if X is given the discrete topology, then any function f:X >Y is 
continuous. After all, every subset of X is open in X, and therefore every subset of X of the form 
f~*[V], where V is an open set in Y, is open in X. 


(4) As another example, if X = {a,b} is given the trivial topology, and Y = X = {a,b} is given the 
discrete topology, then the identity function iy: X — Y is not continuous. To see this, just note that {a} 
is open in Y (because every subset of Y is open), but iy*({a}) = {a} is not open in X (because {a} + Ø 
and {a} # X). 


(5) Constant functions are always continuous. Indeed, let b € Y and suppose that f: X — Y is defined 
by f(x) = b forallx € X. Let B CY. If b € B, then f~+[B] = X and if b ¢ B, then f~*[B] = Ø. Since 
X and @ are open in any topology on X, f is continuous. 


(6) If ℬ is a basis for 𝒰, then to determine if f is continuous, we need only check that for each V ∈ ℬ,
we have f⁻¹[V] ∈ 𝒯. To see this, assume that for each V ∈ ℬ, we have f⁻¹[V] ∈ 𝒯, and let O ∈ 𝒰.
Since ℬ is a basis for 𝒰, O = ⋃𝒳 for some subset 𝒳 of ℬ. So, f⁻¹[O] = f⁻¹[⋃𝒳] = ⋃{f⁻¹[V] | V ∈ 𝒳}
(by part (ii) of Problem 1 below). Since 𝒯 is a topology, it is closed under taking arbitrary unions, and
therefore, ⋃{f⁻¹[V] | V ∈ 𝒳} ∈ 𝒯.


Similarly, if 𝒮 is a subbasis for 𝒰, then to determine if f is continuous, we need only check that for each
V ∈ 𝒮, we have f⁻¹[V] ∈ 𝒯. To see this, let’s assume that for each V ∈ 𝒮, we have f⁻¹[V] ∈ 𝒯 and let
ℬ be the collection of all finite intersections of sets in 𝒮. Then ℬ is a basis for 𝒰. Let A ∈ ℬ. Then
A = ⋂𝒳 for some finite subset 𝒳 of 𝒮. So, f⁻¹[A] = f⁻¹[⋂𝒳] = ⋂{f⁻¹[V] | V ∈ 𝒳} (Check this!).
Since 𝒯 is a topology, it is closed under taking finite intersections, and so, ⋂{f⁻¹[V] | V ∈ 𝒳} ∈ 𝒯.
By the first part of this note, it follows that f is continuous.


Example 14.12:


1. Let (A, 𝒯) and (B, 𝒰) be the topological spaces with sets A = {a, b} and B = {1, 2, 3} and
topologies 𝒯 = {∅, {a}, {a, b}} and 𝒰 = {∅, {1, 2}, {1, 2, 3}}. The function f: A → B defined by
f(a) = 1 and f(b) = 3 is continuous because f⁻¹[{1, 2}] = {a}, which is open in (A, 𝒯). On
the other hand, the function g: A → B defined by g(a) = 3 and g(b) = 1 is not continuous
because g⁻¹[{1, 2}] = {b}, which is not open in (A, 𝒯). We can visualize these two functions as
follows:


[Figure: diagrams of f (continuous) and g (not continuous)]


2. Consider (R, 𝒯) and (R, 𝒰), where 𝒯 is the standard topology on R and 𝒰 is the topology
generated by the basis {(a, ∞) | a ∈ R}. To avoid confusion, let’s use the notation R_𝒯 and R_𝒰
to indicate that we are considering R with the topologies 𝒯 and 𝒰, respectively. The identity
function i₁: R_𝒯 → R_𝒰 is continuous because i₁⁻¹[(a, ∞)] = (a, ∞) is open in (R, 𝒯) for every
a ∈ R. However, the identity function i₂: R_𝒰 → R_𝒯 is not continuous because (0, 1) is open in
(R, 𝒯), but i₂⁻¹[(0, 1)] = (0, 1) is not open in (R, 𝒰).

3. Consider (R, 𝒯) and (S, 𝒰), where 𝒯 is the standard topology on R, S = {a, b, c}, and 𝒰 is the
topology {∅, {a}, {a, b}, {a, b, c}}. The function f: R → S defined by

f(x) = b if x < 0,
f(x) = c if x ≥ 0

is continuous because f⁻¹[{a}] = ∅ and f⁻¹[{a, b}] = (−∞, 0) are both open in (R, 𝒯).

If we replace the topology 𝒰 by the topology 𝒱 = {∅, {c}, {a, b, c}}, then the same function f is
not continuous because f⁻¹[{c}] = [0, ∞), which is not open in (R, 𝒯).
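
For finite spaces such as the ones in part 1 of this example, the definition of continuity can be checked mechanically. The following is a minimal sketch, assuming a function is stored as a dictionary and a topology as a list of sets; the helper names `inverse_image` and `is_continuous` are hypothetical.

```python
def inverse_image(f, B):
    """Return the inverse image of B under f as a frozenset."""
    return frozenset(x for x in f if f[x] in B)


def is_continuous(f, T, U):
    """Check that the inverse image of every open set in U is open in T."""
    open_in_domain = {frozenset(O) for O in T}
    return all(inverse_image(f, V) in open_in_domain for V in U)


# The two topologies from part 1 of Example 14.12.
T = [set(), {"a"}, {"a", "b"}]
U = [set(), {1, 2}, {1, 2, 3}]

f = {"a": 1, "b": 3}
g = {"a": 3, "b": 1}

print(is_continuous(f, T, U))  # True
print(is_continuous(g, T, U))  # False: the inverse image of {1, 2} is {"b"}
```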


Let (X, 𝒯) and (Y, 𝒰) be topological spaces. A function f: X → Y is continuous at x ∈ X if for each
V ∈ 𝒰 with f(x) ∈ V, there is U ∈ 𝒯 with x ∈ U such that f[U] ⊆ V.


Example 14.13: 
1. Consider the functions f and g from part 1 of Example 14.12. They are pictured below. 


[Figure: diagrams of f (continuous) and g (not continuous)]


Let’s check that f is continuous at a. There are two open sets containing f(a) = 1. The first
one is {1, 2}. The set {a} is open and f[{a}] = {1} ⊆ {1, 2}. The second open set containing 1
is {1, 2, 3}. We can use the open set {a} again because f[{a}] = {1} ⊆ {1, 2, 3}. Alternatively,
we can use the open set {a, b} because f[{a, b}] = {1, 3} ⊆ {1, 2, 3}.

Let’s also check that f is continuous at b. The only open set containing f(b) = 3 is {1, 2, 3}. We
have b ∈ {a, b} and f[{a, b}] = {1, 3} ⊆ {1, 2, 3}.

The function g is continuous at a because the only open set containing g(a) = 3 is {1, 2, 3} and
we have a ∈ {a} and g[{a}] = {3} ⊆ {1, 2, 3}.

The function g is not continuous at b. The open set {1, 2} contains g(b) = 1. However, the only
open set containing b is {a, b} and g[{a, b}] = {1, 3} ⊈ {1, 2}.


2. Define f: R → R by

f(x) = x if x < 0,
f(x) = x + 1 if x ≥ 0.

Then f is not continuous at 0. To see this, note that f(0) = 1 ∈ (0, 2) and if 0 ∈ (a, b), then
f[(a, b)] = (a, 0) ∪ [1, b + 1) ⊈ (0, 2) because a/2 ∈ (a, 0), so that a/2 < 0, and therefore, a/2 ∉ (0, 2).


If a > 0, then f is continuous at a. To see this, let (c, d) be an open interval containing
f(a) = a + 1. Then c < a + 1 < d, and so, c − 1 < a < d − 1. Let k = max{0, c − 1}. Then
we have k < a < d − 1. So, a ∈ (k, d − 1). Since k ≥ 0, f[(k, d − 1)] = (k + 1, d). We now
show that (k + 1, d) ⊆ (c, d). Let y ∈ (k + 1, d). Then k + 1 < y < d. Since k ≥ c − 1,
k + 1 ≥ c. Thus, c < y < d, and therefore, y ∈ (c, d). It follows that f[(k, d − 1)] ⊆ (c, d).


Also, if a < 0, then f is continuous at a. To see this, let (c, d) be an open interval containing
f(a) = a. Then c < a < d. Let k = min{0, d}. Then we have c < a < k. So, a ∈ (c, k). Finally,
note that f[(c, k)] = (c, k) ⊆ (c, d).


We will see in Theorem 14.4 below that if f: R → R, where R is given the standard topology,
then the topological definition of continuity here agrees with all the equivalent definitions of 
continuity from Lesson 13. 
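
The pointwise checks from part 1 of this example can also be carried out by a short program. The following is a minimal sketch for finite spaces, assuming functions are stored as dictionaries and topologies as lists of sets; the helper names `image` and `is_continuous_at` are hypothetical.

```python
def image(f, A):
    """Return f[A] = {f(x) | x in A}."""
    return {f[x] for x in A}


def is_continuous_at(f, x, T, U):
    """For each open V containing f(x), look for an open W containing x
    whose image is a subset of V."""
    return all(
        any(x in W and image(f, W) <= set(V) for W in T)
        for V in U
        if f[x] in V
    )


# The spaces and functions from part 1 of Example 14.12.
T = [set(), {"a"}, {"a", "b"}]
U = [set(), {1, 2}, {1, 2, 3}]
f = {"a": 1, "b": 3}
g = {"a": 3, "b": 1}

for name, h in (("f", f), ("g", g)):
    for x in ("a", "b"):
        print(name, "is continuous at", x, ":", is_continuous_at(h, x, T, U))
```

The only check that fails is the one for g at b, which agrees with the discussion above.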


Theorem 14.3: Let (X, 𝒯) and (Y, 𝒰) be topological spaces and let f: X → Y. Then f is continuous if
and only if f is continuous at each x ∈ X.


Proof: Let (X, 𝒯) and (Y, 𝒰) be topological spaces and let f: X → Y. First, suppose that f is continuous.
Let x ∈ X and let V ∈ 𝒰 with f(x) ∈ V. Since f is continuous, f⁻¹[V] ∈ 𝒯. If we let U = f⁻¹[V], then
x ∈ U and, by part (i) of Problem 1 below, we have f[U] = f[f⁻¹[V]] ⊆ V.


Conversely, suppose that f is continuous at each x ∈ X. Let V ∈ 𝒰. If f⁻¹[V] = ∅, then f⁻¹[V] ∈ 𝒯
because every topology contains the empty set. If f⁻¹[V] ≠ ∅, let x ∈ f⁻¹[V]. Then f(x) ∈ V. So,
there is U_x ∈ 𝒯 with x ∈ U_x such that f[U_x] ⊆ V. Let U = ⋃{U_x | x ∈ f⁻¹[V]}. Since U is a union of
open sets, U ∈ 𝒯. We will show that U = f⁻¹[V]. Let z ∈ U. Then there is x ∈ f⁻¹[V] with z ∈ U_x. So, we
have f(z) ∈ f[U_x]. Since f[U_x] ⊆ V, f(z) ∈ V. Thus, z ∈ f⁻¹[V]. Since z ∈ U was arbitrary, we have
shown that U ⊆ f⁻¹[V]. Now, let z ∈ f⁻¹[V]. Then f(z) ∈ V. So, z ∈ U_z. Since U_z ⊆ U, we have
z ∈ U. Since z ∈ f⁻¹[V] was arbitrary, we have shown that f⁻¹[V] ⊆ U. Since U ⊆ f⁻¹[V] and
f⁻¹[V] ⊆ U, we have U = f⁻¹[V]. □


We now give an ε-δ definition of continuity for metrizable topological spaces.


Theorem 14.4: Let (X, 𝒯) and (Y, 𝒰) be metrizable topological spaces where 𝒯 and 𝒰 are induced by
the metrics d and ρ, respectively. f: X → Y is continuous at x ∈ X if and only if for all ε > 0 there is
δ > 0 such that d(x, y) < δ implies ρ(f(x), f(y)) < ε.


Proof: Let (X, 𝒯) and (Y, 𝒰) be metrizable topological spaces with corresponding metrics d and ρ and let x ∈ X.


First, suppose that f: X → Y is continuous at x ∈ X and let ε > 0. f(x) ∈ B_ε(f(x)) and B_ε(f(x)) is
open in 𝒰. Since f is continuous at x, there is U ∈ 𝒯 with x ∈ U such that f[U] ⊆ B_ε(f(x)). Since the
open balls form a basis for 𝒯, we can find δ > 0 such that B_δ(x) ⊆ U (Why?). It follows that
f[B_δ(x)] ⊆ f[U] and so, f[B_δ(x)] ⊆ B_ε(f(x)). Now, if d(x, y) < δ, then y ∈ B_δ(x). So,
f(y) ∈ f[B_δ(x)]. Since f[B_δ(x)] ⊆ B_ε(f(x)), we have f(y) ∈ B_ε(f(x)). So, ρ(f(x), f(y)) < ε.


Conversely, suppose that for all ε > 0 there is δ > 0 such that d(x, y) < δ implies ρ(f(x), f(y)) < ε.
Let V ∈ 𝒰 with f(x) ∈ V. Since the open balls form a basis for 𝒰, there is ε > 0 such that
f(x) ∈ B_ε(f(x)) and B_ε(f(x)) ⊆ V (Why?). Choose δ > 0 such that d(x, y) < δ implies
ρ(f(x), f(y)) < ε. Let U = B_δ(x). Then U ∈ 𝒯 and x ∈ U. We show that f[U] ⊆ V. Let y ∈ f[U].
Then there is z ∈ U with y = f(z). Since z ∈ U = B_δ(x), d(x, z) < δ. Therefore, ρ(f(x), f(z)) < ε.
So, f(z) ∈ B_ε(f(x)). Since B_ε(f(x)) ⊆ V, f(z) ∈ V. Since y = f(z), we have y ∈ V, as desired. □


Note: If we consider a function f: R → R with the metric d(x, y) = |x − y|, Theorem 14.4 shows that
all our definitions of continuity given in Lesson 13 are equivalent to the topological definitions given 
here. 
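
The ε-δ condition can be explored numerically for the function from part 2 of Example 14.13. The following is only a sketch: sampling finitely many points can exhibit a failure of the condition, but it cannot prove continuity. The helper name `find_violation`, the sample count, and the chosen values of ε and δ are all assumptions made for illustration.

```python
def f(x):
    """The function from part 2 of Example 14.13."""
    return x if x < 0 else x + 1


def find_violation(func, x, eps, delta, samples=1000):
    """Search sample points y with 0 < |x - y| < delta for one with
    |func(x) - func(y)| >= eps. Return such a y, or None if none is found."""
    for k in range(1, samples + 1):
        offset = delta * k / (samples + 1)  # 0 < offset < delta
        for y in (x - offset, x + offset):
            if abs(func(x) - func(y)) >= eps:
                return y
    return None


eps = 0.5

# At x = 0, every delta we try admits a witness y < 0 with |f(0) - f(y)| > 1.
for delta in (1.0, 0.1, 0.001):
    print(delta, find_violation(f, 0.0, eps, delta))

# At x = 2, delta = 0.25 produces no violation among the sampled points.
print(find_violation(f, 2.0, eps, 0.25))  # None
```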


Let (X, 𝒯) and (Y, 𝒰) be topological spaces. A function f: X → Y is a homeomorphism if f is a bijection
such that O ∈ 𝒯 if and only if f[O] ∈ 𝒰.


Notes: (1) If f: X → Y is a bijection, then every subset V ⊆ Y can be written as f[O] for exactly one
subset O ⊆ X. If f is also continuous, then given O ⊆ X with f[O] ∈ 𝒰, we have O = f⁻¹[f[O]] ∈ 𝒯.
Conversely, suppose that f is a bijection such that for every subset O of X, f[O] ∈ 𝒰 implies O ∈ 𝒯.
Then, given V ∈ 𝒰, since there is O ⊆ X with V = f[O], by our assumption, we have
f⁻¹[V] = f⁻¹[f[O]] = O ∈ 𝒯, showing that f is continuous. It follows that f is a continuous bijection
if and only if f is a bijection such that ∀O ⊆ X (f[O] ∈ 𝒰 → O ∈ 𝒯).


(2) Similarly, f: X → Y is a bijective function with continuous inverse f⁻¹: Y → X if and only if f is a
bijection such that ∀O ⊆ X (O ∈ 𝒯 → f[O] ∈ 𝒰).


(3) Notes 1 and 2 tell us that f: X → Y is a homeomorphism if and only if f is a continuous bijective
function with a continuous inverse.


(4) Since a homeomorphism is bijective, it provides a one to one correspondence between the elements
of X and the elements of Y. However, a homeomorphism does much more than this. It also provides a
one to one correspondence between the sets in 𝒯 and the sets in 𝒰.


(5) A homeomorphism between two topological spaces is analogous to an isomorphism between two
algebraic structures (see Lesson 11). From the topologist’s point of view, if there is a homeomorphism
from one space to another, the two topological spaces are indistinguishable.


We say that two topological spaces (X, 𝒯) and (Y, 𝒰) are homeomorphic or topologically equivalent
if there is a homeomorphism f: X → Y.


Example 14.14:


1. Let S = {a, b}, 𝒯 = {∅, {a}, {a, b}}, and 𝒰 = {∅, {b}, {a, b}}. The map f: S → S defined by
f(a) = b and f(b) = a is a homeomorphism from (S, 𝒯) to (S, 𝒰). Notice that the inverse
image of the open set {b} ∈ 𝒰 is the open set {a} ∈ 𝒯. This shows that f is continuous.
Conversely, the image of the open set {a} ∈ 𝒯 is the open set {b} ∈ 𝒰. This shows that f⁻¹ is
continuous. Since f is also a bijection, we have shown that f is a homeomorphism. On the other
hand, the identity function g: S → S defined by g(a) = a and g(b) = b is not a
homeomorphism because it is not continuous. For example, the inverse image of the open set
{b} ∈ 𝒰 is the set {b} which is not in the topology 𝒯. We can visualize these two functions as
follows:


[Figure: diagrams of f (homeomorphism) and g (not a homeomorphism)]


Notice that f and g are both bijections from S to S, but only the function f also gives a one to
one correspondence between the open sets of the space (S, 𝒯) and the open sets of the
space (S, 𝒰).


The homeomorphism f shows that (S, 𝒯) and (S, 𝒰) are topologically equivalent. So, up to
topological equivalence, there are only three topologies on a set with two elements: the trivial
topology, the discrete topology, and the topology with exactly three open sets.


2. Let S = {a, b, c}, 𝒯 = {∅, {b}, {a, b}, {a, b, c}}, and 𝒰 = {∅, {a, b}, {a, b, c}}. Then the identity
function f: S → S is a continuous bijection from (S, 𝒯) to (S, 𝒰). Indeed, the inverse image of
the open set {a, b} ∈ 𝒰 is the open set {a, b} ∈ 𝒯. However, f is not a homeomorphism
because f⁻¹ is not continuous. The set {b} is open in 𝒯, but its image f[{b}] = {b} is not open
in 𝒰.


3. We saw in part 3 of Example 14.1 that there are 29 topologies on a set with three elements.
However, up to topological equivalence, there are only 9. Below is a visual representation of
the 9 distinct topologies on the set S = {a, b, c}, up to topological equivalence.


[Figure: the 9 topologies on S = {a, b, c}, up to topological equivalence]


The dedicated reader should verify that each of the other 20 topologies is topologically
equivalent to one of these and that no two topologies displayed here are topologically
equivalent.


4. Consider R together with the standard topology. Define f: R → R by f(x) = 2x + 3. Let’s
check that f is a homeomorphism. If x ≠ y, then 2x ≠ 2y, and so, 2x + 3 ≠ 2y + 3. Therefore,
∀x, y ∈ R (x ≠ y → f(x) ≠ f(y)). That is, f is injective. Next, if y ∈ R, let x = (y − 3)/2. Then
f(x) = f((y − 3)/2) = 2((y − 3)/2) + 3 = (y − 3) + 3 = y. So, ∀y ∈ R ∃x ∈ R (f(x) = y). That is, f is
surjective. Now, let (a, b) be a bounded open interval. f⁻¹[(a, b)] = ((a − 3)/2, (b − 3)/2), which is open.
So, f is continuous. Also, f[(a, b)] = (2a + 3, 2b + 3), which is open. So, f⁻¹ is continuous.
Since f is a continuous bijection with a continuous inverse, f is a homeomorphism.


5. Consider (R, 𝒯) and (R, 𝒰), where 𝒯 is the standard topology on R and 𝒰 is the topology
generated by the basis {(a, ∞) | a ∈ R}. We saw in part 2 of Example 14.12 that the identity
function i: R_𝒯 → R_𝒰 is continuous because i⁻¹[(a, ∞)] = (a, ∞) is open in (R, 𝒯) for every
a ∈ R. However, this function is not a homeomorphism because i⁻¹ is not continuous. For
example, (0, 1) is open in (R, 𝒯), but i[(0, 1)] = (0, 1) is not open in (R, 𝒰).
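
The counts in part 3 of this example (29 topologies on a three-element set, and 9 up to topological equivalence) can be confirmed by brute force. The following is a minimal sketch; the helper names `is_topology`, `push_forward`, and `homeomorphic` are hypothetical. It uses the fact that on a finite set, closure under pairwise unions and intersections is enough, and that a bijection f is a homeomorphism exactly when the collection of images f[O] of the open sets O of the first space equals the topology of the second space.

```python
from itertools import combinations, permutations

S = ("a", "b", "c")
subsets = [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]


def is_topology(T):
    """Check that T contains the empty set and S and is closed under
    pairwise unions and intersections (sufficient on a finite set)."""
    if frozenset() not in T or frozenset(S) not in T:
        return False
    return all(A | B in T and A & B in T for A in T for B in T)


# Test every collection of subsets of S (there are 2^8 = 256 of them).
topologies = []
for r in range(len(subsets) + 1):
    for family in combinations(subsets, r):
        T = frozenset(family)
        if is_topology(T):
            topologies.append(T)


def push_forward(T, f):
    """Return {f[O] | O in T} for a bijection f stored as a dict."""
    return frozenset(frozenset(f[x] for x in O) for O in T)


def homeomorphic(T, U):
    """Try every bijection from S to S."""
    return any(push_forward(T, dict(zip(S, p))) == U for p in permutations(S))


# Keep one representative from each equivalence class.
classes = []
for T in topologies:
    if not any(homeomorphic(T, rep) for rep in classes):
        classes.append(T)

print(len(topologies))  # 29
print(len(classes))     # 9
```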


A topological property or topological invariant is a property that is preserved under homeomorphisms.
More specifically, we say that property P is a topological property if whenever the topological space
(S, 𝒯) has property P and (X, 𝒰) is topologically equivalent to (S, 𝒯), then (X, 𝒰) also has property P.


In Problem 5 below, you will be asked to show that compactness is a topological property. As another
example, let’s show that the property of being a T₂-space is a topological property.


Theorem 14.5: Let (S, 𝒯) be a T₂-space and let (X, 𝒰) be topologically equivalent to (S, 𝒯). Then (X, 𝒰)
is a T₂-space.


Proof: Let (S, 𝒯) be a T₂-space and let f: S → X be a homeomorphism. Let x, y ∈ X with x ≠ y. Since
f is bijective, there are z, w ∈ S with z ≠ w such that f(z) = x and f(w) = y. Since (S, 𝒯) is a
T₂-space, there are open sets U, V ∈ 𝒯 with z ∈ U, w ∈ V, and U ∩ V = ∅. Since f is a
homeomorphism, f[U], f[V] ∈ 𝒰. We also have x = f(z) ∈ f[U] and y = f(w) ∈ f[V]. We show
that f[U] ∩ f[V] = ∅. If not, there is c ∈ f[U] ∩ f[V]. So, there are a ∈ U and b ∈ V with f(a) = c
and f(b) = c. So, f(a) = f(b). Since f is injective, a = b. But then a ∈ U ∩ V, contradicting that
U ∩ V = ∅. It follows that f[U] ∩ f[V] = ∅. Therefore, (X, 𝒰) is a T₂-space. □


The dedicated reader might want to show that each of the other separation axioms (T₀ through T₄) is a
topological property and that metrizability is a topological property.


Problem Set 14


Full solutions to these problems are available for free download here: 


www.SATPrepGet800.com/PMFBXSG 


LEVEL 1 


1. Let f: A → B and let 𝒳 be a nonempty collection of subsets of B. Prove the following:
(i) For any V ∈ 𝒳, f[f⁻¹[V]] ⊆ V.
(ii) f⁻¹[⋃𝒳] = ⋃{f⁻¹[V] | V ∈ 𝒳}.


2. Let (S, d) be a metric space. Prove that for all x ∈ S, d(x, x) = 0.


LEVEL 2 


3. Prove that ℬ = {X ⊆ R | R \ X is finite} generates a topology 𝒯 on R that is strictly coarser than
the standard topology. 𝒯 is called the cofinite topology on R.


4. Let K = {1/n | n ∈ Z⁺}, ℬ = {(a, b) | a, b ∈ R ∧ a < b} ∪ {(a, b) \ K | a, b ∈ R ∧ a < b}. Prove
that ℬ is a basis for a topology 𝒯_K on R that is strictly finer than the standard topology on R.


LEVEL 3 


5. Let (K, 𝒯) and (L, 𝒰) be topological spaces with (K, 𝒯) compact and let f: K → L be a
homeomorphism. Prove that (L, 𝒰) is compact.


6. Let S be a nonempty set and let ℬ be a collection of subsets of S. Prove that the set generated by
ℬ, {⋃𝒳 | 𝒳 ⊆ ℬ}, is equal to {A ⊆ S | ∀x ∈ A ∃B ∈ ℬ (x ∈ B ∧ B ⊆ A)}.


7. Define the functions d₁ and d₂ from C × C to R by d₁(z, w) = |Re z − Re w| + |Im z − Im w|
and d₂(z, w) = max{|Re z − Re w|, |Im z − Im w|}. Prove that (C, d₁) and (C, d₂) are metric
spaces such that d₁ and d₂ induce the standard topology on C.


8. Let (S, 𝒯) be a topological space and let A ⊆ S. Prove that 𝒯_A = {A ∩ X | X ∈ 𝒯} is a topology
on A. Then prove that if ℬ is a basis for 𝒯, then ℬ_A = {A ∩ B | B ∈ ℬ} is a basis for 𝒯_A. 𝒯_A is
called the subspace topology on A.


LEVEL 4 


9. Let ℬ′ = {(a, b) | a, b ∈ Q ∧ a < b}. Prove that ℬ′ is countable and that ℬ′ is a basis for a
topology on R. Then show that the topology generated by ℬ′ is the standard topology on R.


10. Let (S, 𝒯) be a T₂-space and A ⊆ S. Prove that (A, 𝒯_A) is a T₂-space (see Problem 8 for the
definition of 𝒯_A). Determine if the analogous statement is true for T₃-spaces.


11. Let (S₁, 𝒯₁) and (S₂, 𝒯₂) be topological spaces. Let ℬ = {U × V | U ∈ 𝒯₁ ∧ V ∈ 𝒯₂}. Prove that ℬ
is a basis for a topology 𝒯 on S₁ × S₂, but in general, ℬ itself is not a topology on S₁ × S₂. Then
prove that if ℬ₁ is a basis for 𝒯₁ and ℬ₂ is a basis for 𝒯₂, then 𝒞 = {U × V | U ∈ ℬ₁ ∧ V ∈ ℬ₂} is
a basis for 𝒯. The topology 𝒯 is called the product topology on S₁ × S₂.


LEVEL 5 


12. Let (S₁, 𝒯₁) and (S₂, 𝒯₂) be T₂-spaces. Prove that S₁ × S₂ with the product topology (as defined
in Problem 11) is also a T₂-space. Determine if the analogous statement is true for T₃-spaces.


13. Let 𝒯_L be the set generated by the half open intervals of the form [a, b) with a, b ∈ R. Show that
𝒯_L is a topology on R that is strictly finer than the standard topology on R and incomparable with
the topology 𝒯_K.


14. Prove that every metrizable space is T₄.


15. Consider the topological space (R, 𝒯_L). Prove that R² with the corresponding product topology
(as defined in Problem 11) is a T₃-space, but not a T₄-space.


16. Let (S₁, 𝒯₁) and (S₂, 𝒯₂) be metrizable spaces. Prove that S₁ × S₂ with the product topology is
metrizable. Use this to show that (R, 𝒯_L) is not metrizable.


LESSON 15 - COMPLEX ANALYSIS
COMPLEX VALUED FUNCTIONS 


The Unit Circle 


Recall from Lesson 7 that a circle in the Complex Plane is the set of all points that are at a fixed distance 
(called the radius of the circle) from a fixed point (called the center of the circle). 


The circumference of a circle is the distance around the circle. 


If C and C′ are the circumferences of two circles with radii r and r′, respectively, then it turns out that
C/(2r) = C′/(2r′). In other words, the value of the ratio Circumference/(2(radius)) is independent of the
circle that we use to form this ratio. We leave the proof of this fact for the interested reader to
investigate themselves. We call the common value of this ratio π (pronounced “pi”). So, we have
C/(2r) = π, or equivalently, C = 2πr.


Example 15.1: The unit circle is the circle with radius 1 and center
(0, 0). The equation of this circle is |z| = 1. If we write z in the
standard form z = x + yi, we see that |z| = √(x² + y²), and so,
the equation of the unit circle can also be written x² + y² = 1.
To the right is a picture of the unit circle in the Complex Plane.


The circumference of the unit circle is 2π · 1 = 2π.


An angle in standard position consists of two rays, both of which 
have their initial point at the origin, and one of which is the 
positive x-axis. We call the positive x-axis the initial ray and we 
call the second ray the terminal ray. The radian measure of the 
angle is the part of the circumference of the unit circle beginning 
at the point (1, 0) on the positive x-axis and eventually ending at the point on the unit circle intercepted 
by the second ray. If the motion is in the counterclockwise direction, the radian measure is positive and 
if the motion is in the clockwise direction, the radian measure is negative. 


Example 15.2: Let’s draw a few angles where the terminal ray lies along the line y = x.


Observe that in the leftmost picture, the arc intercepted by the angle has a length that is one-eighth of
the circumference of the circle. Since the circumference of the unit circle is 2π and the motion is in the
counterclockwise direction, the angle has a radian measure of 2π/8 = π/4.


Similarly, in the center picture, the arc intercepted by the angle has a length that is seven-eighths of
the circumference of the circle. This time the motion is in the clockwise direction, and so, the radian
measure of the angle is −(7/8) · 2π = −7π/4.


In the rightmost picture, the angle consists of a complete rotation, tracing out the entire circumference
of the circle, followed by tracing out an additional length that is one-eighth the circumference of the
circle. Since the motion is in the counterclockwise direction, the radian measure of the angle is

2π + 2π/8 = 8π/4 + π/4 = 9π/4.


Let’s find the point of intersection of the unit circle with the terminal ray of the angle π/4 that lies along
the line with equation y = x (as shown in the leftmost figure from Example 15.2 above). If we call this
point (a, b), then we have b = a (because (a, b) is on the line y = x) and a² + b² = 1 (because (a, b)
is on the unit circle). Replacing b by a in the second equation gives us a² + a² = 1, or equivalently,
2a² = 1. So, a² = 1/2. The two solutions to this equation are a = ±√(1/2) = ±√1/√2 = ±1/√2. From the picture,
it should be clear that we are looking for the positive solution, so that a = 1/√2. Since b = a, we also
have b = 1/√2. Therefore, the point of intersection is (1/√2, 1/√2).


Notes: (1) The number 1/√2 can also be written in the form √2/2. To see that these two numbers are equal,
observe that we have

1/√2 = (1/√2) · 1 = (1/√2) · (√2/√2) = (1 · √2)/(√2 · √2) = √2/2.


(2) In the figure below on the left, we see a visual representation of the circle, the given angle, and the
desired point of intersection.


(3) In the figure above on the right, we have divided the Complex Plane into eight regions using the
lines with equations y = x and y = −x (together with the x- and y-axes). We then used the symmetry
of the circle to label the four points of intersection of the unit circle with each of these two lines.


If θ (pronounced “theta”) is the radian measure of an angle in standard position such that the terminal
ray intersects the unit circle at the point (x, y), then we will say that W(θ) = (x, y). This expression
defines a function W: R → R × R called the wrapping function. Observe that the inputs of the
wrapping function are real numbers, which we think of as the radian measure of angles in standard 
position. The outputs of the wrapping function are pairs of real numbers, which we think of as points 
in the Complex Plane. Also, observe that the range of the wrapping function is the unit circle. 


We now define the cosine and sine of the angle θ by cos θ = x and sin θ = y, where W(θ) = (x, y).
For convenience, we also define the tangent of the angle by tan θ = sin θ/cos θ = y/x.


Notes: (1) The wrapping function is not one to one. For example, W(π/2) = (0, 1) and W(5π/2) = (0, 1).
However, π/2 ≠ 5π/2. There are actually infinitely many real numbers that map to (0, 1) under the
wrapping function. Specifically, W(π/2 + 2kπ) = (0, 1) for every k ∈ Z.


In general, each point on the unit circle is the image of infinitely many real numbers. Indeed, if
W(θ) = (a, b), then W(θ + 2kπ) = (a, b) for all k ∈ Z.


(2) The wrapping function gives us a convenient way to associate an angle θ in standard position with
the corresponding point (x, y) on the unit circle. It is mostly used only as a notational convenience. We
will usually be more interested in the expressions cos θ = x and sin θ = y.
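
The wrapping function can be experimented with numerically. The following is a minimal sketch that, in the reverse direction of the definitions above, builds W from the cosine and sine functions in Python's standard `math` module; the helper name `close` and the tolerance are assumptions made for illustration, needed because floating-point values are only approximate.

```python
import math


def W(theta):
    """Wrapping function: the point on the unit circle reached by an angle
    with radian measure theta (computed here from math.cos and math.sin)."""
    return (math.cos(theta), math.sin(theta))


def close(p, q, tol=1e-9):
    """Compare two points up to a small floating-point tolerance."""
    return all(abs(a - b) <= tol for a, b in zip(p, q))


# W is not one to one: pi/2 and 5*pi/2 both map to (0, 1).
print(close(W(math.pi / 2), (0.0, 1.0)))      # True
print(close(W(5 * math.pi / 2), (0.0, 1.0)))  # True

# Adding an integer multiple of 2*pi does not change the output.
theta = 0.7
print(all(close(W(theta + 2 * k * math.pi), W(theta)) for k in range(-3, 4)))  # True

# Every output lies on the unit circle: x^2 + y^2 = 1.
x, y = W(theta)
print(abs(x * x + y * y - 1.0) <= 1e-9)       # True
```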


Example 15.3: Using the rightmost figure above, we can make the following computations:


W(π/4) = (1/√2, 1/√2)    W(3π/4) = (−1/√2, 1/√2)    W(5π/4) = (−1/√2, −1/√2)    W(7π/4) = (1/√2, −1/√2)


cos(π/4) = 1/√2      sin(π/4) = 1/√2      cos(3π/4) = −1/√2    sin(3π/4) = 1/√2
cos(5π/4) = −1/√2    sin(5π/4) = −1/√2    cos(7π/4) = 1/√2     sin(7π/4) = −1/√2


It’s also easy to compute the cosine and sine of the four quadrantal angles 0, π/2, π, and 3π/2. Here we use
the fact that the points (1, 0), (0, 1), (−1, 0), and (0, −1) lie on the unit circle.


W(0) = (1, 0)    W(π/2) = (0, 1)    W(π) = (−1, 0)    W(3π/2) = (0, −1)


cos 0 = 1     sin 0 = 0    cos(π/2) = 0     sin(π/2) = 1
cos π = −1    sin π = 0    cos(3π/2) = 0    sin(3π/2) = −1


Also, if we add any integer multiple of 2π to an angle, the cosine and sine of the new angle have the
same values as the old angle. For example, cos(9π/4) = cos(π/4 + 8π/4) = cos(π/4 + 2π) = cos(π/4) = 1/√2. This is
a direct consequence of the fact that W(θ + 2kπ) = W(θ) for all k ∈ Z.


We can also compute the tangent of each angle by dividing the sine of the angle by the cosine of the
angle. For example, we have

tan(π/4) = sin(π/4)/cos(π/4) = (1/√2)/(1/√2) = 1.

Similarly, we have

tan(3π/4) = −1    tan(5π/4) = 1    tan(7π/4) = −1    tan 0 = 0    tan π = 0


When θ = π/2 or 3π/2, tan θ is undefined.


Notes: (1) If z = x + yi is any complex number, then the point (x, y) lies on a circle of radius r centered
at the origin, where r = |z| = √(x² + y²). If θ is the radian measure of an angle in standard position
such that the terminal ray intersects this circle at the point (x, y), then it can be proved that the cosine
and sine of the angle are equal to cos θ = x/r and sin θ = y/r.


(2) It is standard to use the abbreviations cos² θ and sin² θ for (cos θ)² and (sin θ)², respectively.


From the definition of cosine and sine, we have the following formula called the Pythagorean Identity:

cos² θ + sin² θ = 1


(3) Also, from the definition of cosine and sine, we have the following two formulas called the Negative
Identities:


cos(−θ) = cos θ        sin(−θ) = −sin θ.


Theorem 15.1: Let θ and φ be the radian measures of angles A and B, respectively. Then we have
cos(θ + φ) = cos θ cos φ − sin θ sin φ
sin(θ + φ) = sin θ cos φ + cos θ sin φ.


Notes: (1) The two formulas appearing in Theorem 15.1 are called the Sum Identities. You will be asked 
to prove Theorem 15.1 in Problem 14 below (parts (i) and (v)). 


(2) Theorem 15.1 will be used to prove De Moivre’s Theorem (Theorem 15.2) below. De Moivre’s 
Theorem provides a fast method for performing exponentiation of complex numbers. 


(3) θ and φ are Greek letters pronounced “theta” and “phi,” respectively. These letters are often used
to represent angle measures. We may sometimes also use the capital versions of these letters, Θ and
Φ, especially when insisting that the radian measures of the given angles are between −π and π.
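
The Pythagorean Identity, the Negative Identities, and the Sum Identities can be spot-checked numerically. The following sketch is not a proof: it only tests a handful of arbitrarily chosen sample angles, and the helper name `nearly_equal` and its tolerance are assumptions.

```python
import math

# Arbitrary sample angles (radian measures) chosen for illustration.
angles = [0.0, math.pi / 4, 1.0, 2.5, -math.pi / 3]


def nearly_equal(a, b):
    """Compare two floats up to a small tolerance."""
    return math.isclose(a, b, rel_tol=1e-12, abs_tol=1e-12)


pythagorean = all(
    nearly_equal(math.cos(t) ** 2 + math.sin(t) ** 2, 1.0) for t in angles
)

negative = all(
    nearly_equal(math.cos(-t), math.cos(t))
    and nearly_equal(math.sin(-t), -math.sin(t))
    for t in angles
)

sum_identities = all(
    nearly_equal(
        math.cos(t + p),
        math.cos(t) * math.cos(p) - math.sin(t) * math.sin(p),
    )
    and nearly_equal(
        math.sin(t + p),
        math.sin(t) * math.cos(p) + math.cos(t) * math.sin(p),
    )
    for t in angles
    for p in angles
)

print(pythagorean, negative, sum_identities)  # True True True
```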




Exponential Form of a Complex Number 


The standard form (or rectangular form) of a complex number z is 
z = x + yi, where x and y are real numbers. Recall from Lesson 7 that 
we can visualize the complex number z = x + yi as the point (x, y) in 
the Complex Plane. 


[Figure: the point (x, y), or (r, θ), in the Complex Plane]


If for z ≠ 0, we let r = |z| = |x + yi| = √(x² + y²) and we let θ be the
radian measure of an angle in standard position such that the terminal
ray passes through the point (x, y), then we see that r and θ determine
this point. So, we can also write this point as (r, θ).


In Note 1 following Example 15.3, we saw that cos θ = x/r and sin θ = y/r. By multiplying each side of the
last two equations by r, we get x = r cos θ and y = r sin θ. These equations allow us to rewrite the
complex number z = x + yi in the polar form z = r cos θ + ri sin θ = r(cos θ + i sin θ).


If we also make the definition e^(iθ) = cos θ + i sin θ, we can write the complex number z = x + yi in
the exponential form z = re^(iθ).


Recall from Lesson 7 that r = |z| is called the absolute value or modulus of the complex number. We
will call the angle θ an argument of the complex number and we may sometimes write θ = arg z.


Note that although r = |z| and θ = arg z uniquely determine a point (r, θ), there are infinitely many
other values for arg z that represent the same point. Indeed, (r, θ + 2kπ) represents the same point
for each k ∈ Z. However, there is a unique such value Θ for arg z such that −π < Θ ≤ π. We call this
value Θ the principal argument of z, and we write Θ = Arg z.


Notes: (1) The definition e^(iθ) = cos θ + i sin θ is known as Euler’s formula.


(2) When written in exponential form, two complex numbers z = re^(iθ) and w = se^(iφ) are equal if and
only if r = s and φ = θ + 2kπ for some k ∈ Z.


Example 15.4: Let’s convert the complex number z = 1 + i to exponential form. To do this, we need
to find r and θ. We have r = |z| = √(1² + 1²) = √(1 + 1) = √2. Next, we have tan θ = 1/1 = 1. It follows
that θ = π/4. So, in exponential form, we have z = √2 e^((π/4)i).


Note: π/4 is the principal argument of z = 1 + i because −π < π/4 ≤ π. When we write a complex number
in exponential form, we will usually use the principal argument.
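
The conversion in Example 15.4 can be reproduced with Python's standard `cmath` module, where `cmath.polar` returns the modulus together with the principal argument. This is a minimal sketch; the variable names are chosen only for illustration, and floating-point results are approximate.

```python
import cmath
import math

z = 1 + 1j

# cmath.polar returns (r, theta) with theta the principal argument of z.
r, theta = cmath.polar(z)
print(math.isclose(r, math.sqrt(2)))      # True
print(math.isclose(theta, math.pi / 4))   # True

# Rebuild z from its exponential form r * e^(i*theta).
w = r * cmath.exp(1j * theta)
print(cmath.isclose(w, z))                # True

# The principal argument of -1 is pi (not -pi).
print(math.isclose(cmath.phase(-1), math.pi))  # True
```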


If z ∈ C, we define z² to be the complex number z · z. Similarly, z³ = z · z · z = z² · z. More generally,
for z ∈ C and n ∈ Z we define z^n as follows:


• For n = 0, z^n = z⁰ = 1.
• For n ∈ Z⁺, z^(n+1) = z^n · z.
• For n ∈ Z⁻, z^n = (z^(−n))⁻¹ = 1/z^(−n).


Due to the following theorem, it’s often easier to compute z^n when z is written in exponential form.

Theorem 15.2 (De Moivre’s Theorem): For all n ∈ Z, (e^(iθ))^n = e^(i(nθ)).

Proof: For n = 0, we have (e^(iθ))⁰ = (cos θ + i sin θ)⁰ = 1 = e⁰ = e^(i(0θ)).

We prove De Moivre’s Theorem for n ∈ Z⁺ by induction on n.

Base Case (k = 1): (e^(iθ))¹ = e^(iθ) = e^(i(1θ)).

Inductive Step: Assume that k ≥ 1 and (e^(iθ))^k = e^(i(kθ)). We then have

(e^(iθ))^(k+1) = (cos θ + i sin θ)^(k+1) = (cos θ + i sin θ)^k (cos θ + i sin θ) = (e^(iθ))^k (cos θ + i sin θ)
= e^(i(kθ)) (cos θ + i sin θ) = (cos kθ + i sin kθ)(cos θ + i sin θ)
= [(cos kθ)(cos θ) − (sin kθ)(sin θ)] + [(sin kθ)(cos θ) + (cos kθ)(sin θ)]i
= cos((k + 1)θ) + sin((k + 1)θ) i (by Theorem 15.1) = e^(i((k+1)θ)).


By the Principle of Mathematical Induction, (e^(iθ))^n = e^(i(nθ)) for all n ∈ Z⁺.


If n < 0, then

(e^(iθ))^n = 1/(e^(iθ))^(−n) = 1/e^(i(−nθ)) = 1/(cos(−nθ) + i sin(−nθ))
= 1/(cos(nθ) − i sin(nθ)) (by the Negative Identities)
= [1/(cos(nθ) − i sin(nθ))] · [(cos(nθ) + i sin(nθ))/(cos(nθ) + i sin(nθ))] = (cos(nθ) + i sin(nθ))/(cos²(nθ) + sin²(nθ))
= cos(nθ) + i sin(nθ) (by the Pythagorean Identity) = e^(i(nθ)). □
Note: De Moivre’s Theorem generalizes to all n ∈ C with a small “twist.” In general, the expression
(e^(iθ))^n may have multiple values, whereas e^(i(nθ)) takes on just one value. However, for all n ∈ C,
(e^(iθ))^n = e^(i(nθ)) in the sense that e^(i(nθ)) is equal to one of the possible values of (e^(iθ))^n.


As a very simple example, let θ = 0 and n = 1/2. Then e^(i(nθ)) = e⁰ = 1 and (e^(iθ))^n = 1^(1/2), which has two
values: 1 and −1 (because 1² = 1 and (−1)² = 1). Observe that e^(i(nθ)) is equal to one of the two
possible values of (e^(iθ))^n.
We will not prove this more general result here.


Example 15.5: Let’s compute (2 − 2i)⁶. If we let z = 2 − 2i, we have tan θ = −2/2 = −1, so that θ = 7π/4
(Why?). Also, r = |z| = √(2² + (−2)²) = √(2²(1 + 1)) = √(2² · 2) = √(2²) · √2 = 2√2. So, in exponential
form, z = 2√2 e^((7π/4)i), and therefore,

z⁶ = (2√2 e^((7π/4)i))⁶ = 2⁶(√2)⁶(e^((7π/4)i))⁶ = 64 · 8e^(6(7π/4)i) = 512e^((21π/2)i) = 512e^((π/2 + 10π)i)
= 512e^((π/2)i) = 512(cos(π/2) + i sin(π/2)) = 512(0 + i · 1) = 512i.
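
Both De Moivre’s Theorem (for integer exponents) and the result of Example 15.5 can be spot-checked with Python's `cmath` module. This is a sketch under stated assumptions: the helper name `de_moivre_holds`, the sample angles, and the tolerances are hypothetical, and a finite set of numerical checks is not a proof. Note that `cmath.polar` returns the principal argument −π/4 for 2 − 2i, which differs from the argument 7π/4 by 2π and therefore represents the same point.

```python
import cmath
import math


def de_moivre_holds(theta, n):
    """Compare (e^(i*theta))^n with e^(i*(n*theta)) for an integer n."""
    lhs = cmath.exp(1j * theta) ** n
    rhs = cmath.exp(1j * n * theta)
    return cmath.isclose(lhs, rhs, rel_tol=1e-9, abs_tol=1e-9)


sample_angles = (0.3, math.pi / 4, 2.0, -1.7)
print(all(de_moivre_holds(t, n) for t in sample_angles for n in range(-6, 7)))  # True

# Example 15.5: (2 - 2i)^6 computed directly and via exponential form.
z = 2 - 2j
direct = z ** 6
r, theta = cmath.polar(z)                      # theta is -pi/4 here
via_exponential = r ** 6 * cmath.exp(1j * 6 * theta)

print(cmath.isclose(direct, 512j))                            # True
print(cmath.isclose(via_exponential, 512j, rel_tol=1e-9))     # True
```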


Recall that a square root of a complex number z is a complex number w such that z = w² (see Lesson
7). More generally, if z ∈ C and n ∈ Z⁺, we say that w ∈ C is an nth root of z if z = w^n.


Suppose that z = re^(iθ) and w = se^(iφ) are exponential forms of z, w ∈ C and that w is an nth root of z.
Let’s derive a formula for w in terms of r and θ.


We have w^n = s^n(e^(iφ))^n = s^n e^(i(nφ)). Since z = w^n, re^(iθ) = s^n e^(i(nφ)). So, s^n = r and nφ = θ + 2kπ,
where k ∈ Z. Therefore, s = ⁿ√r and φ = θ/n + 2kπ/n for k ∈ Z. Thus, w = ⁿ√r e^(i(θ/n + 2kπ/n)), k ∈ Z.

If k ≥ n, then θ/n + 2kπ/n = θ/n + 2(k − n)π/n + 2nπ/n = θ/n + 2(k − n)π/n + 2π,
and therefore, e^(i(θ/n + 2kπ/n)) = e^(i(θ/n + 2(k − n)π/n)).


It follows that there are exactly n distinct nth roots of z given by w = ⁿ√r e^(i(θ/n + 2kπ/n)), k = 0, 1, ..., n − 1.
The principal nth root of z, written ⁿ√z, is ⁿ√r e^(i(θ/n)), where −π < θ ≤ π.
Example 15.6: Let’s compute all the eighth roots of 1 (also called the 8th roots of unity). If 1 = w⁸,
then w = ⁸√1 e^(i(0/8 + 2kπ/8)) = e^((kπ/4)i) for k = 0, 1, 2, 3, 4, 5, 6, 7. Substituting each of these values for k into
the expression e^((kπ/4)i) gives us the following 8 eighth roots of unity.

1,  1/√2 + (1/√2)i,  i,  −1/√2 + (1/√2)i,  −1,  −1/√2 − (1/√2)i,  −i,  1/√2 − (1/√2)i


Note: Notice how the eight 8th roots of unity are uniformly distributed on the unit circle. 
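
The formula for the n distinct nth roots translates directly into a short program. The following is a minimal sketch using Python's `cmath` module; the function name `nth_roots` is hypothetical, it assumes z ≠ 0, and the results are floating-point approximations.

```python
import cmath
import math


def nth_roots(z, n):
    """Return the n distinct nth roots of a nonzero complex number z,
    using w = r^(1/n) * e^(i(theta/n + 2k*pi/n)) for k = 0, 1, ..., n - 1."""
    r, theta = cmath.polar(z)   # theta is the principal argument of z
    s = r ** (1.0 / n)          # the positive real nth root of the modulus
    return [
        s * cmath.exp(1j * (theta / n + 2 * k * math.pi / n))
        for k in range(n)
    ]


roots = nth_roots(1, 8)
for w in roots:
    print(w)

# Each root w satisfies w^8 = 1 (up to rounding), and k = 0 gives the principal root.
print(all(cmath.isclose(w ** 8, 1, abs_tol=1e-9) for w in roots))      # True
print(cmath.isclose(roots[1], (1 + 1j) / math.sqrt(2), abs_tol=1e-9))  # True
```

Each printed value has modulus 1, and consecutive values differ in argument by π/4, which matches the observation that the roots are uniformly distributed on the unit circle.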


Functions of a Complex Variable 


We will be considering functions f: A → C, where A ⊆ C. If z ∈ A, then f(z) = w for some w ∈ C.


If we write both z and w in standard form, then we have z = x + yi and w = u + vi for some real 
numbers x, y,u, and v. Note that the values of u and v depend upon the values of x and y. It follows 
that the complex function f is equivalent to a pair of real functions u, v: R? > R. That is, we have 


f(z) =f + yi) =u, y) + ivy). 
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If we write z in the exponential form z = ret, we have f(z) = f(re'®) = u(r,0) + iv(7, 9). 


Notes: (1) If $f: A \to \mathbb{C}$, $z = x + yi$, and $f(z) = u + vi$, then the function $f$ takes the point $(x, y)$ in the
Complex Plane to the point $(u, v)$ in the Complex Plane.


Compare this to a real-valued function, where a point x on the real line is taken to a point y on the real 
line. The usual treatment here is to draw two real lines perpendicular to each other, label one of them 
the x-axis and the other the y-axis. This forms a plane and we can plot points (x, f (x)) in the usual 
way. 


With complex-valued functions, we cannot visualize the situation in an analogous manner. The 
problem is that a visualization using this method would require us to plot points of the form (x, y, u, v). 
So, we would need a four-dimensional version of the two-dimensional plane, but humans are capable 
of perceiving only three dimensions. Therefore, we will need to come up with other methods for 
visualizing complex-valued functions. 


(2) One way to visualize a complex-valued function is to simply stay in
the same plane and to analyze how a typical point moves or how a
certain set is transformed. For example, let $f: \mathbb{C} \to \mathbb{C}$ be defined by
$f(z) = z - 1$. Then the function $f$ takes the point $(x, y)$ to the point
$(x - 1, y)$. That is, each point is shifted one unit to the left. Similarly, if
$S \subseteq \mathbb{C}$, then each point of the set $S$ is shifted one unit to the left by the
function $f$. Both these situations are demonstrated in the figure to the
right.


This method may work well for very simple functions, but for more complicated functions, the method 
in Note 3 below will usually be preferable. 


(3) A second way to visualize a complex-valued function is to draw two separate planes: an xy-plane 
and a uv-plane. We can then draw a point or a set in the xy-plane and its image under f in the 
uv-plane. Let’s see how this works for the function f defined by f(z) = z — 1 (the same function we 
used in Note 2). 
Example 15.7:
1. Let $f(z) = z + i$.
If we write $z = x + yi$, then we have $f(x + yi) = x + yi + i = x + (y + 1)i$.
So, $u(x, y) = x$ and $v(x, y) = y + 1$.


Geometrically, f is a translation. It takes any point (x, y) in the 
Complex Plane and translates it up one unit. For example, the 
point (1, 2) is translated to (1,3) under the function f because 
$f(1 + 2i) = (1 + 2i) + i = 1 + 3i$. We can see this in the
figure to the right. 


Observe that any vertical line is mapped to itself under the 
function f. We can see this geometrically because given a 
vertical line in the Complex Plane, each point is just moved up 
one unit along that same vertical line. The vertical line in the 
figure on the right has equation x = 1. If we let L be the set of 
points on the line x = 1, then we see that f[L] = L. In fact, the function f maps L bijectively 
onto L. It might be more precise to say that f maps the vertical line x = 1 in the xy-plane to 
the vertical line $u = 1$ in the $uv$-plane.


If a subset $X$ of $\mathbb{C}$ satisfies $f[X] \subseteq X$, we will say that $X$ is invariant under the function $f$. If
f[X] = X, then we will say that X is surjectively invariant under f. So, in this example, we see 
that any vertical line L is surjectively invariant under f. 


A horizontal line, however, is not invariant under the function f. For example, the horizontal 
line $y = 1$ in the $xy$-plane is mapped bijectively to the horizontal line $v = 2$ in the $uv$-plane.
We can visualize this mapping as follows:

[Figure: the horizontal line $y = 1$ in the $xy$-plane (left) and its image, the horizontal line $v = 2$, in the $uv$-plane (right).]


In fact, for any “shape” in the xy-plane, after applying the function f, we wind up with the same 
shape shifted up 1 unit in the uv-plane. We can even think of this function as shifting the whole 
plane up 1 unit. More specifically, the image of the xy-plane under f is the entire uv-plane, 
where each point in the xy-plane is mapped to the point in the uv-plane that is shifted up 1 
unit from the original point. So, C is surjectively invariant under f. 

2. Let $g(z) = \bar{z}$.

If we write $z = x + yi$, then we have $g(x + yi) = x - yi$.
So, $u(x, y) = x$ and $v(x, y) = -y$.


Geometrically, $g$ is a reflection in the $x$-axis (or real axis). It
takes any point $(x, y)$ in the Complex Plane and reflects it
through the $x$-axis to the point $(x, -y)$. For example, the point
$(1, 2)$ is reflected through the $x$-axis to the point $(1, -2)$ under
the function $g$ because $g(1 + 2i) = 1 - 2i$. We can see this in
the figure to the right.

Observe that the $x$-axis is invariant under $g$. To see this, note
that any point on the $x$-axis has the form $(a, 0)$ for some $a \in \mathbb{R}$
and $g(a + 0i) = a - 0i = a = a + 0i$. Notice that $g$ actually
maps each point on the $x$-axis to itself. Therefore, we call each
point on the $x$-axis a fixed point of $g$.

It’s not hard to see that the subsets of $\mathbb{C}$ that are invariant under
$g$ are precisely the subsets that are symmetric with respect to
the $x$-axis. However, points above and below the $x$-axis are not
fixed points of $g$, as they are reflected across the $x$-axis. The
figure below should help to visualize this. Note that in this
example, invariant is equivalent to surjectively invariant.


In the figure, the rectangle displayed is invariant under $g$. The fixed points of $g$ in the rectangle
are the points on the $x$-axis. We see that points below the $x$-axis in the $xy$-plane are mapped
to points above the $u$-axis in the $uv$-plane. A typical point below the $x$-axis and its image under
$g$ above the $u$-axis are shown. Similarly, points above the $x$-axis in the $xy$-plane are mapped to
points below the $u$-axis in the $uv$-plane.


3. Let $h(z) = iz$.
If we write $z = x + yi$, then we have $h(x + yi) = i(x + yi) = xi + yi^2 = xi - y = -y + xi$.

So, the function $h$ takes any point $(x, y)$ to the point $(-y, x)$. To understand what this means
geometrically, it is useful to analyze what the image looks like in exponential form.

If we write $z = re^{i\theta}$, then we have $h(re^{i\theta}) = i(re^{i\theta}) = e^{\frac{\pi}{2}i}(re^{i\theta}) = re^{\frac{\pi}{2}i}e^{i\theta} = re^{i\left(\theta + \frac{\pi}{2}\right)}$.
Notice that $r$ remains unchanged under this
transformation. So, $h(z)$ is the same distance
from the origin as $z$. However, the angle
changes from $\theta$ to $\theta + \frac{\pi}{2}$. Geometrically, $h$ is a
rotation about the origin by $\frac{\pi}{2}$ radians, or
equivalently, 90°. As an example, the point
$(1, 1)$ is rotated 90° about the origin to the
point $(-1, 1)$ (see the figure to the right). We
can see this in one of two ways. If we use the
standard form of $1 + i$, then we have
$h(1 + i) = -1 + i$. If we use exponential form,
then by Example 15.4, $1 + i = \sqrt{2}e^{\frac{\pi}{4}i}$. So,
$h\left(\sqrt{2}e^{\frac{\pi}{4}i}\right) = \sqrt{2}e^{\left(\frac{\pi}{4} + \frac{\pi}{2}\right)i} = \sqrt{2}e^{\frac{3\pi}{4}i}$. Therefore,
we have $u = \sqrt{2}\cos\frac{3\pi}{4} = \sqrt{2}\left(-\frac{1}{\sqrt{2}}\right) = -1$ and
$v = \sqrt{2}\sin\frac{3\pi}{4} = \sqrt{2}\left(\frac{1}{\sqrt{2}}\right) = 1$. So, once again, $h(1 + i) = -1 + 1i = -1 + i$.


Observe that any circle centered at the origin is surjectively invariant under h and the only fixed 
point of h is the origin. 


4. Let $p(z) = z^2$.

If we write $z = re^{i\theta}$, then we have $p(re^{i\theta}) = (re^{i\theta})^2 = r^2(e^{i\theta})^2 = r^2e^{i(2\theta)}$ by De Moivre’s
Theorem.


Under this function, the modulus of the complex number z is squared and the argument is 
doubled. As an example, let’s see what happens to the point (1,1) under this function. 


Changing to exponential form, by Example 15.4, we have $1 + i = \sqrt{2}e^{\frac{\pi}{4}i}$. So, $p(1 + i) = 2e^{\frac{\pi}{2}i}$.
We see that the modulus of $p(1 + i)$ is 2 and the argument of $p(1 + i)$ is $\frac{\pi}{2}$. So, in the Complex
Plane, this is the point that is 2 units from the origin on the positive $y$-axis (because
$W\left(\frac{\pi}{2}\right) = (0, 1)$ and $(0, 1)$ lies on the positive $y$-axis). In standard form, we have $p(1 + i) = 2i$.
The only fixed points of $p$ are $z = 0$ and $z = 1$. To see this, note that if $r^2e^{i(2\theta)} = re^{i\theta}$, then
$r^2 = r$ and $2\theta = \theta + 2k\pi$ for some $k \in \mathbb{Z}$. The equation $r^2 = r$ is equivalent to $r^2 - r = 0$ or
$r(r - 1) = 0$. So, $r = 0$ or $r = 1$. If $r = 0$, then $z = 0$. So, assume $r = 1$. We see that
$2\theta = \theta + 2k\pi$ is equivalent to $\theta = 2k\pi$. So, $z = 1 \cdot e^{i(2k\pi)} = e^{i \cdot 0} = 1$.


Observe that the unit circle is surjectively invariant under $p$. To see this, first note that if
$z = re^{i\theta}$ lies on the unit circle, then $r = 1$ and $p(e^{i\theta}) = e^{i(2\theta)}$, which also has modulus 1.
Furthermore, every point $z$ on the unit circle has the form $z = e^{i\theta}$ and $p\left(e^{i\frac{\theta}{2}}\right) = \left(e^{i\frac{\theta}{2}}\right)^2 = e^{i\theta}$
by De Moivre’s Theorem.
What other subsets of $\mathbb{C}$ are surjectively invariant under $p$? Here are a few:

• The positive real axis: $\{z \in \mathbb{C} \mid \text{Re}\, z > 0 \wedge \text{Im}\, z = 0\}$
• The open unit disk: $\{z \in \mathbb{C} \mid |z| < 1\}$
• The complement of the open unit disk: $\{z \in \mathbb{C} \mid |z| \geq 1\}$


The dedicated reader should prove that these sets are surjectively invariant under p. Are there 
any other sets that are surjectively invariant under p? What about sets that are invariant, but 
not surjectively invariant? 
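
The four maps of Example 15.7 are easy to experiment with numerically. The sketch below is only an illustration using Python's built-in complex type; the function names f, g, h, and p mirror the example, while the helper on_unit_circle and the sample angles are assumptions made for this sketch. A numerical spot check of this kind suggests the invariance claims but does not prove them.

```python
import cmath
import math

def f(z):
    return z + 1j            # translation up one unit

def g(z):
    return z.conjugate()     # reflection in the x-axis

def h(z):
    return 1j * z            # rotation about the origin by pi/2

def p(z):
    return z * z             # squares the modulus, doubles the argument

def on_unit_circle(z, tol=1e-12):
    """Hypothetical helper: is |z| equal to 1 up to rounding error?"""
    return abs(abs(z) - 1.0) < tol

print(f(1 + 2j))   # the point (1, 2) moves to (1, 3)
print(g(1 + 2j))   # the point (1, 2) reflects to (1, -2)
print(h(1 + 1j))   # the point (1, 1) rotates to (-1, 1)
print(p(1 + 1j))   # the point (1, 1) is sent to (0, 2)

# Sample points on the unit circle and confirm their images stay on it.
samples = [cmath.exp(1j * 2 * math.pi * k / 12) for k in range(12)]
circle_stays_put = all(on_unit_circle(p(z)) for z in samples)
print(circle_stays_put)
```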


Limits and Continuity 


Let $A \subseteq \mathbb{C}$, let $f: A \to \mathbb{C}$, let $L \in \mathbb{C}$, and let $a \in \mathbb{C}$ be a point such that $A$ contains some deleted
neighborhood of $a$. We say that the limit of $f$ as $z$ approaches $a$ is $L$, written $\lim_{z \to a} f(z) = L$, if for every
positive number $\epsilon$, there is a positive number $\delta$ such that $0 < |z - a| < \delta \to |f(z) - L| < \epsilon$.


Notes: (1) The statement of this definition of limit is essentially the same as the statement of the $\epsilon$-$\delta$
definition of a limit of a real-valued function (see Lesson 13). However, the geometry looks very
different.

For a real-valued function, a deleted neighborhood of $a$ has the form $N_\epsilon^{\odot}(a) = (a - \epsilon, a) \cup (a, a + \epsilon)$
and we can visualize this neighborhood as follows:

[Figure: the real line with the points $a - \epsilon$, $a$, and $a + \epsilon$ marked and the point $a$ deleted.]

For a complex-valued function, a deleted neighborhood of $a$, say
$N_\epsilon^{\odot}(a) = \{z \in \mathbb{C} \mid 0 < |z - a| < \epsilon\}$, is a punctured disk with center $a$.
We can see a visualization of such a neighborhood to the right.


(2) In R, there is a simple one to one correspondence between 
neighborhoods (open intervals) and (vertical or horizontal) strips. 


In C there is no such correspondence. Therefore, for complex-valued 
functions, we start right away with the $\epsilon$-$\delta$ definition.


(3) Recall that in $\mathbb{R}$, the expression $|x - a| < \delta$ is equivalent to
$a - \delta < x < a + \delta$, or $x \in (a - \delta, a + \delta)$.

Also, the expression $0 < |x - a|$ is equivalent to $x - a \neq 0$, or $x \neq a$.
Therefore, $0 < |x - a| < \delta$ is equivalent to $x \in (a - \delta, a) \cup (a, a + \delta)$.
In $\mathbb{C}$, if we let $z = x + yi$ and $a = b + ci$, then
$$|z - a| = |(x + yi) - (b + ci)| = |(x - b) + (y - c)i| = \sqrt{(x - b)^2 + (y - c)^2}.$$

So, $|z - a| < \delta$ is equivalent to $(x - b)^2 + (y - c)^2 < \delta^2$. In other words, $(x, y)$ is inside the disk with
center $(b, c)$ and radius $\delta$.
Also, we have
$$0 < |z - a| \leftrightarrow (x - b)^2 + (y - c)^2 \neq 0 \leftrightarrow x - b \neq 0 \text{ or } y - c \neq 0 \leftrightarrow x \neq b \text{ or } y \neq c \leftrightarrow z \neq a.$$
Therefore, $0 < |z - a| < \delta$ is equivalent to “$z$ is in the punctured disk with center $a$ and radius $\delta$.”


(4) Similarly, in $\mathbb{R}$, we have that $|f(x) - L| < \epsilon$ is equivalent to $f(x) \in (L - \epsilon, L + \epsilon)$, while in $\mathbb{C}$, we
have $|f(z) - L| < \epsilon$ is equivalent to “$f(z)$ is in the disk with center $L$ and radius $\epsilon$.”

(5) Just like for real-valued functions, we can think of determining if $\lim_{z \to a} f(z) = L$ as the result of an
$\epsilon$-$\delta$ game. Player 1 “attacks” by choosing a positive number $\epsilon$. This is equivalent to Player 1 choosing
the disk $N_\epsilon(L) = \{w \in \mathbb{C} \mid |w - L| < \epsilon\}$.

Player 2 then tries to “defend” by finding a positive number $\delta$. This is equivalent to Player 2 choosing
the punctured disk $N_\delta^{\odot}(a) = \{z \in \mathbb{C} \mid 0 < |z - a| < \delta\}$. The defense is successful if $f[N_\delta^{\odot}(a)] \subseteq N_\epsilon(L)$.

If Player 2 defends successfully, then Player 1 chooses a new positive number $\epsilon'$, or equivalently, a new
neighborhood $N_{\epsilon'}(L) = \{w \in \mathbb{C} \mid |w - L| < \epsilon'\}$. If Player 1 is smart, then he or she will choose $\epsilon'$ to be
less than $\epsilon$ (otherwise, Player 2 can use the same $\delta$). The smaller the value of $\epsilon'$, the smaller the
neighborhood $N_{\epsilon'}(L)$, and the harder it will be for Player 2 to defend. Player 2 once again tries to
choose a positive number $\delta'$ so that $f[N_{\delta'}^{\odot}(a)] \subseteq N_{\epsilon'}(L)$. This process continues indefinitely. Player 1
wins the $\epsilon$-$\delta$ game if at some stage, Player 2 cannot defend successfully. Player 2 wins the $\epsilon$-$\delta$
game if he or she defends successfully at every stage.


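(6) If for a given $\epsilon > 0$, we have found a $\delta > 0$ such that $f[N_\delta^{\odot}(a)] \subseteq N_\epsilon(L)$, then any positive number
smaller than $\delta$ works as well. Indeed, if $0 < \delta' < \delta$, then $N_{\delta'}^{\odot}(a) \subseteq N_\delta^{\odot}(a)$. It then follows that

$$f[N_{\delta'}^{\odot}(a)] \subseteq f[N_\delta^{\odot}(a)] \subseteq N_\epsilon(L).$$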


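Example 15.8: Let’s use the $\epsilon$-$\delta$ definition of limit to prove that $\lim_{z \to 3 + 6i}\left(\frac{iz}{3} + 2\right) = i$.

Analysis: Given $\epsilon > 0$, we will find $\delta > 0$ so that $0 < |z - (3 + 6i)| < \delta$ implies $\left|\left(\frac{iz}{3} + 2\right) - i\right| < \epsilon$.
First note that

$$\left|\left(\frac{iz}{3} + 2\right) - i\right| = \left|\frac{1}{3}(iz + 6 - 3i)\right| = \frac{1}{3}|i(z - 6i - 3)| = \frac{1}{3}|i||z - 3 - 6i| = \frac{1}{3}|z - (3 + 6i)|.$$

So, $\left|\left(\frac{iz}{3} + 2\right) - i\right| < \epsilon$ is equivalent to $|z - (3 + 6i)| < 3\epsilon$. Therefore, $\delta = 3\epsilon$ should work.

Proof: Let $\epsilon > 0$ and let $\delta = 3\epsilon$. Suppose that $0 < |z - (3 + 6i)| < \delta$. Then we have

$$\left|\left(\frac{iz}{3} + 2\right) - i\right| = \frac{1}{3}|z - (3 + 6i)| < \frac{1}{3}\delta = \frac{1}{3}(3\epsilon) = \epsilon.$$

Since $\epsilon > 0$ was arbitrary, we have $\forall \epsilon > 0\ \exists \delta > 0\ \left(0 < |z - (3 + 6i)| < \delta \to \left|\left(\frac{iz}{3} + 2\right) - i\right| < \epsilon\right)$.

Therefore, $\lim_{z \to 3 + 6i}\left(\frac{iz}{3} + 2\right) = i$. □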
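
As a sanity check on the choice $\delta = 3\epsilon$, the implication in the definition can be tested on randomly sampled points of the punctured disk. This is a minimal Python sketch under stated assumptions: the function name implication_holds, the sample count, and the random seed are all hypothetical choices, and a finite sample can only fail to find a counterexample; it cannot replace the proof above.

```python
import cmath
import math
import random

def f(z):
    return 1j * z / 3 + 2

a = 3 + 6j   # the point being approached
L = 1j       # the claimed limit

def implication_holds(eps, delta, trials=10000, seed=0):
    """Hypothetical check: sample z with 0 < |z - a| < delta and
    report whether |f(z) - L| < eps held for every sample."""
    rng = random.Random(seed)
    for _ in range(trials):
        radius = delta * rng.uniform(0.001, 0.999)   # strictly between 0 and delta
        angle = rng.uniform(0.0, 2.0 * math.pi)
        z = a + radius * cmath.exp(1j * angle)
        if not abs(f(z) - L) < eps:
            return False
    return True

eps = 0.01
print(implication_holds(eps, 3 * eps))   # the delta from the proof
print(implication_holds(eps, 4 * eps))   # a delta that is too large
```

The second call uses a larger $\delta$ on purpose: since $|f(z) - i| = \frac{1}{3}|z - (3 + 6i)|$, any sampled point farther than $3\epsilon$ from $3 + 6i$ violates the inequality.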


Example 15.9: Let’s use the $\epsilon$-$\delta$ definition of limit to prove that $\lim_{z \to i} z^2 = -1$.

Analysis: Given $\epsilon > 0$, we need to find $\delta > 0$ so that $0 < |z - i| < \delta$ implies $|z^2 - (-1)| < \epsilon$. First
note that $|z^2 - (-1)| = |z^2 + 1| = |(z - i)(z + i)| = |z - i||z + i|$. Therefore, $|z^2 - (-1)| < \epsilon$ is
equivalent to $|z - i||z + i| < \epsilon$.

As in Example 13.9 from Lesson 13, $|z - i|$ is not an issue because we’re going to be choosing $\delta$ so that
this expression is small enough. But to make the argument work we also need to keep $|z + i|$ from getting too large.
Remember from Note 6 above that if we find a value for $\delta$ that works, then any smaller positive number
will work too. This allows us to start by assuming that $\delta$ is smaller than any positive number we choose.
So, let’s just assume that $\delta \leq 1$ and see what effect that has on $|z + i|$.

Well, if $\delta \leq 1$ and $0 < |z - i| < \delta$, then $|z + i| = |(z - i) + 2i| \leq |z - i| + |2i| < 1 + 2 = 3$. Here
we used the Standard Advanced Calculus Trick (SACT) from Note 7 after Example 4.5 in Lesson 4,
followed by the Triangle Inequality (Theorem 7.3), and then the computation $|2i| = |2||i| = 2 \cdot 1 = 2$.
So, if we assume that $\delta \leq 1$, then $|z^2 - (-1)| = |z - i||z + i| < \delta \cdot 3 = 3\delta$. Therefore, if we want to
make sure that $|z^2 - (-1)| < \epsilon$, then it suffices to choose $\delta$ so that $3\delta \leq \epsilon$, as long as we also have
$\delta \leq 1$. So, we will let $\delta = \min\left\{1, \frac{\epsilon}{3}\right\}$.

Proof: Let $\epsilon > 0$ and let $\delta = \min\left\{1, \frac{\epsilon}{3}\right\}$. Suppose that $0 < |z - i| < \delta$. Then since $\delta \leq 1$, we have
$|z + i| = |(z - i) + 2i| \leq |z - i| + |2i| = |z - i| + |2||i| = |z - i| + 2 < 1 + 2 = 3$, and therefore,
$$|z^2 - (-1)| = |z^2 + 1| = |(z - i)(z + i)| = |z - i||z + i| < \delta \cdot 3 \leq \frac{\epsilon}{3} \cdot 3 = \epsilon.$$

Since $\epsilon > 0$ was arbitrary, we have $\forall \epsilon > 0\ \exists \delta > 0\ (0 < |z - i| < \delta \to |z^2 - (-1)| < \epsilon)$. Therefore,
$\lim_{z \to i} z^2 = -1$. □


Theorem 15.3: If $\lim_{z \to a} f(z)$ exists, then it is unique.

Proof: Suppose that $\lim_{z \to a} f(z) = L$ and $\lim_{z \to a} f(z) = K$. Let $\epsilon > 0$. Since $\lim_{z \to a} f(z) = L$, we can find
$\delta_1 > 0$ such that $0 < |z - a| < \delta_1 \to |f(z) - L| < \frac{\epsilon}{2}$. Since $\lim_{z \to a} f(z) = K$, we can find $\delta_2 > 0$ such that
$0 < |z - a| < \delta_2 \to |f(z) - K| < \frac{\epsilon}{2}$. Let $\delta = \min\{\delta_1, \delta_2\}$. Suppose that $0 < |z - a| < \delta$. Then
$|L - K| = |(f(z) - K) - (f(z) - L)|$ (SACT) $\leq |f(z) - K| + |f(z) - L|$ (TI) $< \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon$. Since $\epsilon$
was an arbitrary positive real number, by Problem 8 from Lesson 5, we have $|L - K| = 0$. So,
$L - K = 0$, and therefore, $L = K$. □

Note: SACT stands for the Standard Advanced Calculus Trick and TI stands for the Triangle Inequality.
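Example 15.10: Let’s show that $\lim_{z \to 0}\left(\frac{z}{\bar{z}}\right)^2$ does not exist.

Proof: If we consider complex numbers of the form $x + 0i$ with $x \neq 0$, then $\left(\frac{z}{\bar{z}}\right)^2 = \left(\frac{x + 0i}{x - 0i}\right)^2 = \left(\frac{x}{x}\right)^2 = 1^2 = 1$. Since
every deleted neighborhood of 0 contains points of the form $x + 0i$, we see that if $\lim_{z \to 0}\left(\frac{z}{\bar{z}}\right)^2$ exists, it
must be equal to 1.

Next, let’s consider complex numbers of the form $x + xi$ with $x \neq 0$. In this case, $\left(\frac{z}{\bar{z}}\right)^2 = \left(\frac{x + xi}{x - xi}\right)^2 = \left(\frac{1 + i}{1 - i}\right)^2 = i^2 = -1$.
Since every deleted neighborhood of 0 contains points of the form $x + xi$, we see that if $\lim_{z \to 0}\left(\frac{z}{\bar{z}}\right)^2$ exists,
it must be equal to $-1$.
By Theorem 15.3, the limit does not exist. □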
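
The two paths used in this proof can also be followed numerically. The snippet below is a small Python illustration only; the function name q and the chosen sample values of x are assumptions made for this sketch.

```python
def q(z):
    """The function from Example 15.10; z must be nonzero."""
    return (z / z.conjugate()) ** 2

# Approach 0 along the real axis (points x + 0i) and along the
# diagonal (points x + xi) and compare the values of q.
xs = (1e-1, 1e-3, 1e-6)
along_real_axis = [q(complex(x, 0.0)) for x in xs]
along_diagonal = [q(complex(x, x)) for x in xs]

print(along_real_axis)
print(along_diagonal)
```

However small x is taken, the first list stays at 1 and the second stays at $-1$ (up to rounding), so no single value can serve as the limit.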


Define $d: \mathbb{C} \times \mathbb{C} \to \mathbb{R}$ by $d(z, w) = |z - w|$. By Example 14.10 (part 1), $(\mathbb{C}, d)$ is a metric space. So, by
Theorem 14.4, we have the following definition of continuity for complex-valued functions:

Let $A \subseteq \mathbb{C}$, let $f: A \to \mathbb{C}$, and let $a \in A$ be a point such that $A$ contains some neighborhood of $a$. $f$ is
continuous at $a$ if and only if for every positive number $\epsilon$, there is a positive number $\delta$ such that

$$|z - a| < \delta \to |f(z) - f(a)| < \epsilon.$$
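Example 15.11: Let $f: \mathbb{C} \to \mathbb{C}$ be defined by $f(z) = \frac{iz}{3} + 2$. In Example 15.8, we showed that
$\lim_{z \to 3 + 6i} f(z) = i$. Since $f(3 + 6i) = \frac{i(3 + 6i)}{3} + 2 = \frac{3i + 6i^2}{3} + 2 = \frac{3i - 6}{3} + 2 = i - 2 + 2 = i$, we see from
the proof in Example 15.8 that if $|z - (3 + 6i)| < \delta$, then $|f(z) - f(3 + 6i)| = \left|\left(\frac{iz}{3} + 2\right) - i\right| < \epsilon$. It
follows that $f$ is continuous at $3 + 6i$.

More generally, let’s show that for all $a \in \mathbb{C}$, $f$ is continuous at $a$.
Proof: Let $a \in \mathbb{C}$, let $\epsilon > 0$ and let $\delta = 3\epsilon$. Suppose that $|z - a| < \delta$. Then we have

$$|f(z) - f(a)| = \left|\left(\frac{iz}{3} + 2\right) - \left(\frac{ia}{3} + 2\right)\right| = \left|\frac{i}{3}(z - a)\right| = \frac{1}{3}|i||z - a| < \frac{1}{3}\delta = \frac{1}{3}(3\epsilon) = \epsilon.$$

Since $\epsilon > 0$ was arbitrary, we have $\forall \epsilon > 0\ \exists \delta > 0\ (|z - a| < \delta \to |f(z) - f(a)| < \epsilon)$.
Therefore, $f$ is continuous at $a$. □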


Notes: (1) We proved $\forall a \in \mathbb{C}\ \forall \epsilon > 0\ \exists \delta > 0\ \forall z \in \mathbb{C}\ (|z - a| < \delta \to |f(z) - f(a)| < \epsilon)$. In words, we
proved that for every complex number $a$, given a positive real number $\epsilon$, we can find a positive real
number $\delta$ such that whenever the distance between $z$ and $a$ is less than $\delta$, the distance between $f(z)$
and $f(a)$ is less than $\epsilon$. And of course, a simpler way to say this is “for every complex number $a$, $f$ is
continuous at $a$,” or “$\forall a \in \mathbb{C}$ ($f$ is continuous at $a$).”

(2) If we move the expression $\forall a \in \mathbb{C}$ next to $\forall z \in \mathbb{C}$, we get a concept that is stronger than continuity.
We say that a function $f: A \to \mathbb{C}$ is uniformly continuous on $A$ if

$$\forall \epsilon > 0\ \exists \delta > 0\ \forall a, z \in A\ (|z - a| < \delta \to |f(z) - f(a)| < \epsilon).$$

(3) As a quick example of uniform continuity, let’s prove that the function $f: \mathbb{C} \to \mathbb{C}$ defined by
$f(z) = \frac{iz}{3} + 2$ is uniformly continuous on $\mathbb{C}$.

New proof: Let $\epsilon > 0$ and let $\delta = 3\epsilon$. Let $a, z \in \mathbb{C}$ and suppose that $|z - a| < \delta$. Then we have

$$|f(z) - f(a)| = \left|\left(\frac{iz}{3} + 2\right) - \left(\frac{ia}{3} + 2\right)\right| = \left|\frac{i}{3}(z - a)\right| = \frac{1}{3}|z - a| < \frac{1}{3}\delta = \frac{1}{3} \cdot 3\epsilon = \epsilon.$$

Since $\epsilon > 0$ was arbitrary, we have $\forall \epsilon > 0\ \exists \delta > 0\ \forall a, z \in \mathbb{C}\ (|z - a| < \delta \to |f(z) - f(a)| < \epsilon)$.
Therefore, $f$ is uniformly continuous.

(4) The difference between continuity and uniform continuity on a set $A$ can be described as follows:
In both cases, an $\epsilon$ is given and then a $\delta$ is chosen. For continuity, for each value of $a$, we can choose a
different $\delta$. For uniform continuity, once we choose a $\delta$ for some value of $a$, we need to be able to use
the same $\delta$ for every other value of $a$ in $A$.

In terms of disks, once a disk of radius $\epsilon$ is given, we need to be more careful how we choose our disk
of radius $\delta$. As we check different $z$-values, we can translate our chosen disk as much as we like around
the $xy$-plane. However, we are not allowed to decrease the radius of the disk.
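
The role of a single $\delta$ that works at every center can be illustrated numerically. The Python sketch below is an aside rather than part of the lesson: the helper worst_gap, the sample sizes, and the comparison with the squaring function are assumptions introduced here. The squaring function is included only as a contrast, since the same disk radius stops working for it once the center is far from the origin.

```python
import cmath
import math
import random

def f(z):
    return 1j * z / 3 + 2

def square(z):
    return z * z

def worst_gap(func, a, delta, samples=2000, seed=0):
    """Hypothetical helper: largest |func(z) - func(a)| observed over
    randomly sampled z with |z - a| < delta."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        radius = delta * rng.uniform(0.0, 0.999)
        angle = rng.uniform(0.0, 2.0 * math.pi)
        z = a + radius * cmath.exp(1j * angle)
        worst = max(worst, abs(func(z) - func(a)))
    return worst

eps = 0.01
delta = 3 * eps   # the delta from the uniform continuity proof

# The same delta is used at three very different centers a.
centers = (0j, 3 + 6j, 1000 - 2000j)
for a in centers:
    print(a, worst_gap(f, a, delta) < eps, worst_gap(square, a, delta) < eps)
```

For f the observed gap stays below $\epsilon$ at every center, as the proof predicts. For the squaring function the gap is below $\epsilon$ near the origin but far above it at the distant center, so this particular $\delta$ cannot be translated around the plane.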
The Riemann Sphere

We have used the symbols $-\infty$ and $\infty$ (or $+\infty$) to describe unbounded intervals of real numbers, as
well as certain limits of real-valued functions. These symbols are used to express a notion of “infinity.”
If we pretend for a moment that we are standing on the real line at 0, and we begin walking to the
right, continuing indefinitely, then we might say we are walking toward $\infty$. Similarly, if we begin walking
to the left instead, continuing indefinitely, then we might say we are walking toward $-\infty$.

[Figure: the real line with the points $-2$, $-1$, 0, 1, and 2 marked.]

We would like to come up with a coherent notion of infinity with respect to the Complex Plane. There
is certainly more than one way to do this. A method that is most analogous to the picture described
above would be to define a set of infinities $\{\infty_\theta \mid 0 \leq \theta < 2\pi\}$, the idea being that for each angle $\theta$ in
standard position, we have an infinity, $\infty_\theta$, describing where we would be headed if we were to start
at the origin and then begin walking along the terminal ray of $\theta$, continuing indefinitely.


The method in the previous paragraph, although acceptable, has the disadvantage of having to deal 
with uncountably many “infinities.” Instead, we will explore a different notion that involves just a single 
point at infinity. The idea is relatively simple. Pretend you have a large sheet of paper balancing on the 
palm of your hand. The sheet of paper represents the Complex Plane with the origin right at the center
of your palm. The palm of your hand itself represents the unit circle together with its interior. 


Now, imagine using the pointer finger on your other hand to press down on the origin of that sheet of 
paper (the Complex Plane), forcing your hand to form a unit sphere (reshaping the Complex Plane into 
a unit sphere as well). Notice that the origin becomes the “south pole” of the sphere, while all the 
“infinities” described in the last paragraph are forced together at the “north pole” of the sphere. Also, 
notice that the unit circle stays fixed, the points interior to the unit circle form the lower half of the 
sphere, and the points exterior to the unit circle form the upper half of the sphere with the exception 
of the “north pole.” 


When we visualize the unit sphere in this way, we refer to it as the Riemann Sphere.

Let’s let $S^2$ be the Riemann Sphere and let’s officially define the north pole and south pole of $S^2$ to be
the points $N = (0, 0, 1)$ and $S = (0, 0, -1)$, respectively.

Also, since $S^2$ is a subset of three-dimensional space (formally
known as $\mathbb{R}^3$), while $\mathbb{C}$ is only two dimensional, let’s identify $\mathbb{C}$
with $\mathbb{C} \times \{0\}$ so that we write points in the Complex Plane as
$(a, b, 0)$ instead of $(a, b)$. We can then visualize the Complex
Plane as intersecting the Riemann Sphere in the unit circle. To
the right we have a picture of the Riemann Sphere together with
the Complex Plane.
For each point $z$ in the Complex Plane, consider the line passing through the points $N$ and $z$. This line
intersects $S^2$ in exactly one point $P_z$ other than $N$. This observation allows us to define a bijection $f: \mathbb{C} \to S^2 \setminus \{N\}$
defined by $f(z) = P_z$. An explicit definition of $f$ can be given by

$$f(z) = \left(\frac{z + \bar{z}}{1 + |z|^2}, \frac{z - \bar{z}}{i(1 + |z|^2)}, \frac{|z|^2 - 1}{|z|^2 + 1}\right).$$

Below is a picture of a point $z$ in the Complex Plane and its image $f(z) = P_z$ on the Riemann Sphere.

In Challenge Problem 21 below, you will be asked to verify that $f$ is a homeomorphism. If we let
$\bar{\mathbb{C}} = \mathbb{C} \cup \{\infty\}$, then we can extend $f$ to a function $\bar{f}: \bar{\mathbb{C}} \to S^2$ by defining $\bar{f}(\infty) = N$. $\bar{\mathbb{C}}$ is called the
Extended Complex Plane. If we let $\mathcal{T}$ consist of all sets $U \subseteq \bar{\mathbb{C}}$ that are either open in $\mathbb{C}$ or have the
form $U = V \cup \{\infty\}$, where $V$ is the complement of a closed and bounded set in $\mathbb{C}$, then $\mathcal{T}$ defines a
topology on $\bar{\mathbb{C}}$, and $\bar{f}$ is a homeomorphism from $(\bar{\mathbb{C}}, \mathcal{T})$ to $(S^2, \mathcal{U}_{S^2})$, where $\mathcal{U}$ is the product topology
on $\mathbb{R}^3$ with respect to the standard topology on $\mathbb{R}$.
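
The explicit formula for $f$ can be tried out on a few points. The following Python sketch is illustrative only; the function name to_sphere and the sample points are assumptions made here. It evaluates the three coordinates and confirms that the image lies on the unit sphere, that the origin goes to the south pole, and that points far from the origin land near the north pole.

```python
import math

def to_sphere(z):
    """Hypothetical helper: evaluate the explicit formula for f at z,
    returning the point (x1, x2, x3) of the unit sphere."""
    d = 1 + abs(z) ** 2
    x1 = (z + z.conjugate()).real / d
    x2 = ((z - z.conjugate()) / 1j).real / d
    x3 = (abs(z) ** 2 - 1) / d
    return (x1, x2, x3)

def norm(point):
    return math.sqrt(sum(c * c for c in point))

for z in (0j, 1 + 0j, 1j, 0.5 - 0.25j, 3 + 4j, 1e6 + 0j):
    point = to_sphere(z)
    print(z, point, round(norm(point), 12))
```

Points on the unit circle, such as 1 and i, are sent to themselves (with third coordinate 0), points inside the unit circle get a negative third coordinate, and points outside get a positive one, matching the paper-and-palm description above.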


Note: Subspace and product topologies were defined in Problems 8 and 11 in Lesson 14. 


If $\epsilon$ is a small positive number, then $\frac{1}{\epsilon}$ is a large positive number. We see that the set $N_{\frac{1}{\epsilon}} = \left\{z \mid |z| > \frac{1}{\epsilon}\right\}$
is a neighborhood of $\infty$ in the following sense. Notice that $N_{\frac{1}{\epsilon}}$ consists of all points outside of the circle
of radius $\frac{1}{\epsilon}$ centered at the origin. The image of this set under $f$ is a deleted neighborhood of $N$.


We can now extend our definition of limit to include various infinite cases. We will do one example 
here and you will look at others in Problem 18 below. 


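$\lim_{z \to a} f(z) = \infty$ if and only if $\forall \epsilon > 0\ \exists \delta > 0\ \left(0 < |z - a| < \delta \to |f(z)| > \frac{1}{\epsilon}\right)$.

Theorem 15.4: $\lim_{z \to a} f(z) = \infty$ if and only if $\lim_{z \to a} \frac{1}{f(z)} = 0$.

Proof: Suppose $\lim_{z \to a} f(z) = \infty$ and let $\epsilon > 0$. There is $\delta > 0$ so that $0 < |z - a| < \delta \to |f(z)| > \frac{1}{\epsilon}$. But,
$|f(z)| > \frac{1}{\epsilon}$ is equivalent to $\left|\frac{1}{f(z)} - 0\right| < \epsilon$. So, $\lim_{z \to a} \frac{1}{f(z)} = 0$.

Now, suppose $\lim_{z \to a} \frac{1}{f(z)} = 0$ and let $\epsilon > 0$. There is $\delta > 0$ so that $0 < |z - a| < \delta \to \left|\frac{1}{f(z)} - 0\right| < \epsilon$. But,
$\left|\frac{1}{f(z)} - 0\right| < \epsilon$ is equivalent to $|f(z)| > \frac{1}{\epsilon}$. So, $\lim_{z \to a} f(z) = \infty$. □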
Problem Set 15


Full solutions to these problems are available for free download here: 


www.SATPrepGet800.com/PMFBXSG 


LEVEL 1 


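1. In Problems 11 and 12 below, you will be asked to show that $W\left(\frac{\pi}{3}\right) = \left(\frac{1}{2}, \frac{\sqrt{3}}{2}\right)$ and
$W\left(\frac{\pi}{6}\right) = \left(\frac{\sqrt{3}}{2}, \frac{1}{2}\right)$. Use this information to compute the sine, cosine, and tangent of each of the
following angles: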
(i) = 
(ii) 5 
(ii) + 
(iv) = 
vV = 
OE. 
vi) = 


a.n 117 
(viii) zF 


2. Use the sum identities (Theorem 15.1) to compute the cosine, sine, and tangent of each of the 
following angles: 


© = 
ji) 5 
Gii) = 
av = 
LEVEL 2


3. Each of the following complex numbers is written in exponential form. Rewrite each complex 
number in standard form: 


O e" 
Gin tee 
(iii) 3er 
(iv) 2e3! 
(yy Ves! 
(vi) me 4° 
Gives 


4. Each of the following complex numbers is written in standard form. Rewrite each complex 
number in exponential form: 


(i) -1-i 

(ii) V3 +i 

(iii) 1 — V3i 
OCs Cas) 


5. Write the following complex numbers in standard form: 
a ene 
@ (+ 7%) 


(ii) (1+ V3i)” 


LEVEL 3 


6. Use De Moivre’s Theorem to prove the following identities:
(i) $\cos 2\theta = \cos^2\theta - \sin^2\theta$
(ii) $\sin 2\theta = 2\sin\theta\cos\theta$
(iii) $\cos 3\theta = \cos^3\theta - 3\cos\theta\sin^2\theta$

7. Suppose that $z = re^{i\theta}$ and $w = se^{i\phi}$ are complex numbers written in exponential form. Express
each of the following in exponential form. Provide a proof in each case:
(i) $zw$
(ii) $\frac{z}{w}$
8. Write each function in the form $f(z) = u(x, y) + iv(x, y)$ and $f(z) = u(r, \theta) + iv(r, \theta)$:

(i) $f(z) = 2z^2 - 5$
(i) f@ =F 
(iii) f(z)=z2+274+2z4+1 


9. Let f(z) = x? — y? — 2x + 2y(x + 1)i. Rewrite f(z) in terms of z. 


10. Find all complex numbers that satisfy the given equation: 


(i) z&-1=0 
(i) z*+4=0 


LEVEL 4 


11. Consider triangle $AOP$, where $O = (0, 0)$, $A = (1, 0)$, and $P$ is the point on the unit circle so that
angle $POA$ has radian measure $\frac{\pi}{3}$. Prove that triangle $AOP$ is equilateral, and then use this to prove
that $W\left(\frac{\pi}{3}\right) = \left(\frac{1}{2}, \frac{\sqrt{3}}{2}\right)$. You may use the following facts about triangles: (i) The interior angle
measures of a triangle sum to $\pi$ radians; (ii) Two sides of a triangle have the same length if and
only if the interior angles of the triangle opposite these sides have the same measure; (iii) If two
sides of a triangle have the same length, then the line segment beginning at the point of
intersection of those two sides and terminating on the opposite base midway between the
endpoints of that base is perpendicular to that base.

12. Prove that $W\left(\frac{\pi}{6}\right) = \left(\frac{\sqrt{3}}{2}, \frac{1}{2}\right)$. You can use facts (i), (ii), and (iii) described in Problem 11.

13. Let $\theta$ and $\phi$ be the radian measure of angles $A$ and $B$, respectively. Prove the following identity:
$\cos(\theta - \phi) = \cos\theta\cos\phi + \sin\theta\sin\phi$

14. Let $\theta$ and $\phi$ be the radian measure of angles $A$ and $B$, respectively. Prove the following identities:
(i) $\cos(\theta + \phi) = \cos\theta\cos\phi - \sin\theta\sin\phi$
(ii) $\cos(\pi - \theta) = -\cos\theta$
(iii) $\cos\left(\frac{\pi}{2} - \theta\right) = \sin\theta$
(iv) $\sin\left(\frac{\pi}{2} - \theta\right) = \cos\theta$
(v) $\sin(\theta + \phi) = \sin\theta\cos\phi + \cos\theta\sin\phi$
(vi) $\sin(\pi - \theta) = \sin\theta$

15. Let $z, w \in \mathbb{C}$. Prove that $\arg zw = \arg z + \arg w$ in the sense that if two of the three terms in the
equation are specified, then there is a value for the third term so that the equation holds. Similarly,
prove that $\arg\frac{z}{w} = \arg z - \arg w$. Finally, provide examples to show that the corresponding
equations are false if we replace “arg” by “Arg.”
LEVEL 5

16. Define the function $f: \mathbb{C} \to \mathbb{C}$ by $f(z) = z^2$. Determine the images under $f$ of each of the
following sets:

(i) $A = \{x + yi \mid x^2 - y^2 = 1\}$
(ii) $B = \{x + yi \mid x > 0 \wedge y > 0 \wedge xy < 1\}$
(iii) $C = \{x + yi \mid x \geq 0 \wedge y \geq 0\}$
(iv) $D = \{x + yi \mid y \geq 0\}$

17. Let $A \subseteq \mathbb{C}$, let $f: A \to \mathbb{C}$, let $L = j + ki \in \mathbb{C}$, and let $a = b + ci \in \mathbb{C}$ be a point such that $A$
contains some deleted neighborhood of $a$. Suppose that $f(x + yi) = u(x, y) + iv(x, y)$. Prove
that $\lim_{z \to a} f(z) = L$ if and only if $\lim_{(x,y) \to (b,c)} u(x, y) = j$ and $\lim_{(x,y) \to (b,c)} v(x, y) = k$.

18. Give a reasonable definition for each of the following limits (like what was done right before
Theorem 15.4). $L$ is a finite real number.

(i) $\lim_{z \to \infty} f(z) = L$
(ii) $\lim_{z \to \infty} f(z) = \infty$

19. Prove each of the following:

(i) $\lim_{z \to \infty} f(z) = L$ if and only if $\lim_{z \to 0} f\left(\frac{1}{z}\right) = L$.
(ii) $\lim_{z \to \infty} f(z) = \infty$ if and only if $\lim_{z \to 0} \frac{1}{f\left(\frac{1}{z}\right)} = 0$.

20. Let f, g: R > R be defined by f (x) = cos x and g(x) = sin x. Prove that f and g are uniformly 
continuous on R. Hint: Use the fact that the least distance between two points is a straight line. 


CHALLENGE PROBLEM 


21. Consider $\mathbb{C}$ with the standard topology and $S^2$ with its subspace topology, where $S^2$ is being
considered as a subspace of $\mathbb{R}^3$. Let $f: \mathbb{C} \to S^2 \setminus \{N\}$ be defined as follows:

$$f(z) = \left(\frac{z + \bar{z}}{1 + |z|^2}, \frac{z - \bar{z}}{i(1 + |z|^2)}, \frac{|z|^2 - 1}{|z|^2 + 1}\right)$$

Prove that $f$ is a homeomorphism.
LESSON 16 - LINEAR ALGEBRA
LINEAR TRANSFORMATIONS 


Linear Transformations 


Recall from Lesson 8 that a vector space over a field F is a set V together with a binary operation + on 
V (called addition) and an operation called scalar multiplication satisfying the following properties: 


(1) (Closure under addition) For all $v, w \in V$, $v + w \in V$.
(2) (Associativity of addition) For all $v, w, u \in V$, $(v + w) + u = v + (w + u)$.
(3) (Commutativity of addition) For all $v, w \in V$, $v + w = w + v$.
(4) (Additive identity) There exists an element $0 \in V$ such that for all $v \in V$, $0 + v = v + 0 = v$.
(5) (Additive inverse) For each $v \in V$, there is $-v \in V$ such that $v + (-v) = (-v) + v = 0$.
(6) (Closure under scalar multiplication) For all $k \in F$ and $v \in V$, $kv \in V$.
(7) (Scalar multiplication identity) If 1 is the multiplicative identity of $F$ and $v \in V$, then $1v = v$.
(8) (Associativity of scalar multiplication) For all $j, k \in F$ and $v \in V$, $(jk)v = j(kv)$.
(9) (Distributivity of 1 scalar over 2 vectors) For all $k \in F$ and $v, w \in V$, $k(v + w) = kv + kw$.
(10) (Distributivity of 2 scalars over 1 vector) For all $j, k \in F$ and $v \in V$, $(j + k)v = jv + kv$.


The simplest examples of vector spaces are $\mathbb{Q}^n$, $\mathbb{R}^n$, and $\mathbb{C}^n$, the vector spaces consisting of $n$-tuples of
rational numbers, real numbers, and complex numbers, respectively. As a specific example, we have
$\mathbb{R}^3 = \{(x, y, z) \mid x, y, z \in \mathbb{R}\}$ with addition defined by $(x, y, z) + (s, t, u) = (x + s, y + t, z + u)$ and
scalar multiplication defined by $k(x, y, z) = (kx, ky, kz)$. Note that unless specified otherwise, we
would usually consider $\mathbb{R}^3$ as a vector space over $\mathbb{R}$, so that the scalars $k$ are all real numbers.


Let $V$ and $W$ be vector spaces over a field $F$, and let $T: V \to W$ be a function from $V$ to $W$.

We say that $T$ is additive if for all $u, v \in V$, $T(u + v) = T(u) + T(v)$.

We say that $T$ is homogeneous if for all $k \in F$ and all $v \in V$, $T(kv) = kT(v)$.

$T$ is a linear transformation if it is additive and homogeneous.


Example 16.1:

1. Let $V = W = \mathbb{C}$ be vector spaces over $\mathbb{R}$ and define $T: \mathbb{C} \to \mathbb{C}$ by $T(z) = 5z$. We see that
$T(z + w) = 5(z + w) = 5z + 5w = T(z) + T(w)$. So, $T$ is additive. Furthermore, we have
$T(kz) = 5(kz) = k(5z) = kT(z)$. So, $T$ is homogeneous. Therefore, $T$ is a linear
transformation.

More generally, for any vector space $V$ over $\mathbb{R}$ and any $m \in \mathbb{R}$, the function $S: V \to V$ defined
by $S(v) = mv$ is a linear transformation. The verification is nearly identical to what we did in
the last paragraph. This type of linear transformation is called a dilation.
Note that if $m, b \in \mathbb{R}$ with $b \neq 0$, then the function $R: V \to V$ defined by $R(v) = mv + b$ is not
a linear transformation. To see this, observe that $R(2v) = m(2v) + b = 2mv + b$ and
$2R(v) = 2(mv + b) = 2mv + 2b$. If $R(2v) = 2R(v)$, then $2mv + b = 2mv + 2b$, or
equivalently, $b = 2b$. Subtracting $b$ from each side of this equation yields $b = 0$, contrary to
our assumption that $b \neq 0$. So, the linear functions that we learned about in high school are
usually not linear transformations. The only linear functions that are linear transformations are
the ones that pass through the origin (in other words, $b$ must be 0).


2. Let V = ℝ⁴ and W = ℝ³ be vector spaces over ℝ and define T: ℝ⁴ → ℝ³ by 
T((x, y, z, w)) = (x + z, 2x − 3y, 5y − 2w). 
We have 
T((x, y, z, w) + (s, t, u, v)) = T((x + s, y + t, z + u, w + v)) 
= ((x + s) + (z + u), 2(x + s) − 3(y + t), 5(y + t) − 2(w + v)) 
= ((x + z) + (s + u), (2x − 3y) + (2s − 3t), (5y − 2w) + (5t − 2v)) 
= (x + z, 2x − 3y, 5y − 2w) + (s + u, 2s − 3t, 5t − 2v) 
= T((x, y, z, w)) + T((s, t, u, v)). 
So, T is additive. Also, we have 
T(k(x, y, z, w)) = T((kx, ky, kz, kw)) 
= (kx + kz, 2(kx) − 3(ky), 5(ky) − 2(kw)) 
= (k(x + z), k(2x − 3y), k(5y − 2w)) 
= k(x + z, 2x − 3y, 5y − 2w) = kT((x, y, z, w)). 
So, T is homogeneous. Therefore, T is a linear transformation. 


3. Let V = ℝ² and W = ℝ be vector spaces over ℝ and define T: ℝ² → ℝ by T((x, y)) = xy. Then 
T is not a linear transformation. Indeed, consider (1, 0), (0, 1) ∈ ℝ². We have 
T((1, 0) + (0, 1)) = T((1, 1)) = 1 · 1 = 1. 
T((1, 0)) + T((0, 1)) = 1 · 0 + 0 · 1 = 0 + 0 = 0. 
So, T((1, 0) + (0, 1)) ≠ T((1, 0)) + T((0, 1)). This shows that T is not additive, and therefore, 
T is not a linear transformation. 


Observe that T is also not homogeneous. To see this, consider (1, 1) ∈ ℝ² and 2 ∈ ℝ. We have 
T(2(1, 1)) = T((2, 2)) = 2 · 2 = 4, but 2T((1, 1)) = 2(1 · 1) = 2 · 1 = 2. 
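

The computations in Example 16.1 can also be spot-checked numerically. The following is a minimal Python sketch, assuming that vectors are represented as tuples of exact rational numbers. The helper names (add, scale, looks_additive, looks_homogeneous) and the sample vectors are illustrative choices and do not come from the text. A single failed check disproves additivity or homogeneity, whereas passing finitely many checks only suggests the property and does not replace the proofs above.

```python
from fractions import Fraction
from itertools import product

def add(u, v):
    """Componentwise sum of two tuples of the same length."""
    return tuple(a + b for a, b in zip(u, v))

def scale(k, v):
    """Scalar multiple k*v of a tuple v."""
    return tuple(k * a for a in v)

def T2(v):
    """The map (x, y, z, w) -> (x + z, 2x - 3y, 5y - 2w) from Example 16.1."""
    x, y, z, w = v
    return (x + z, 2 * x - 3 * y, 5 * y - 2 * w)

def T3(v):
    """The map (x, y) -> xy from Example 16.1, returned as a 1-tuple."""
    x, y = v
    return (x * y,)

def looks_additive(T, samples):
    """Check T(u + v) == T(u) + T(v) on every pair of sample vectors."""
    return all(T(add(u, v)) == add(T(u), T(v))
               for u, v in product(samples, repeat=2))

def looks_homogeneous(T, samples, scalars):
    """Check T(kv) == kT(v) for every sample scalar and sample vector."""
    return all(T(scale(k, v)) == scale(k, T(v))
               for k in scalars for v in samples)

# Hypothetical sample data chosen only for illustration.
scalars = [Fraction(2), Fraction(-1), Fraction(1, 2)]
samples4 = [(1, 0, 0, 0), (0, 1, 0, 0), (2, -1, 3, 5), (Fraction(1, 3), 4, 0, -2)]
samples2 = [(1, 0), (0, 1), (1, 1)]

print(looks_additive(T2, samples4), looks_homogeneous(T2, samples4, scalars))  # True True
print(looks_additive(T3, samples2), looks_homogeneous(T3, samples2, scalars))  # False False
```

The second line of output reflects the two counterexamples in the text: the pair (1, 0), (0, 1) breaks additivity, and the scalar 2 with the vector (1, 1) breaks homogeneity.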


In Problem 3 below, you will be asked to show that neither additivity nor homogeneity alone is enough 
to guarantee that a function is a linear transformation. 


Recall from Lesson 8 that if v, w ∈ V and j, k ∈ F, then jv + kw is called a linear combination of the 
vectors v and w with weights j and k. The next theorem says that a function is a linear transformation 
if and only if it “behaves well” with respect to linear combinations. 


Theorem 16.1: Let V and W be vector spaces over a field F. A function T: V → W is a linear 
transformation if and only if for all v, w ∈ V and all a, b ∈ F, T(av + bw) = aT(v) + bT(w). 


Proof: Suppose that T: V → W is a linear transformation, let v, w ∈ V, and let a, b ∈ F. Since T is 
additive, T(av + bw) = T(av) + T(bw). Since T is homogeneous, T(av) = aT(v) and 
T(bw) = bT(w). Therefore, T(av + bw) = T(av) + T(bw) = aT(v) + bT(w), as desired. 


Conversely, suppose that for all v, w ∈ V and all a, b ∈ F, T(av + bw) = aT(v) + bT(w). Let v, w ∈ V and let 
a = b = 1. Then T(v + w) = T(1v + 1w) = 1T(v) + 1T(w) = T(v) + T(w). Therefore, T is 
additive. Now, let v ∈ V and k ∈ F. Then T(kv) = T(kv + 0v) = kT(v) + 0T(v) = kT(v). 
Therefore, T is homogeneous. It follows that T is a linear transformation. □ 


We can use induction to extend Theorem 16.1 to arbitrary linear combinations. If v ∈ V can be written 
as a linear combination of vectors v₁, v₂, ..., vₙ ∈ V, then T(v) is determined by T(v₁), T(v₂), ..., T(vₙ). 
Specifically, if v = c₁v₁ + c₂v₂ + ⋯ + cₙvₙ, then we have 


T(v) = T(c₁v₁ + c₂v₂ + ⋯ + cₙvₙ) = c₁T(v₁) + c₂T(v₂) + ⋯ + cₙT(vₙ). 


In particular, if B = {v₁, v₂, ..., vₙ} is a basis of V, then T is completely determined by the values of 
T(v₁), T(v₂), ..., T(vₙ). 


Notes: (1) Recall from Lesson 8 that the vectors v₁, v₂, ..., vₙ ∈ V are linearly independent if whenever 
k₁v₁ + k₂v₂ + ⋯ + kₙvₙ = 0, it follows that all the weights k₁, k₂, ..., kₙ are 0. 


(2) Also, recall that the set of all linear combinations of v₁, v₂, ..., vₙ ∈ V is called the span of 
v₁, v₂, ..., vₙ, written span{v₁, v₂, ..., vₙ}. 


(3) The set of vectors {v₁, v₂, ..., vₙ} is a basis of V if v₁, v₂, ..., vₙ are linearly independent and 
span{v₁, v₂, ..., vₙ} = V. 


In particular, if {v₁, v₂, ..., vₙ} is a basis of V, then every vector in V can be written as a linear 
combination of v₁, v₂, ..., vₙ. 


So, if we know the values of T(v₁), T(v₂), ..., T(vₙ), then we know the value of T(v) for any v ∈ V, as 
shown above. 


In other words, given a basis B of V, any function f: B → W extends uniquely to a linear transformation 
T: V → W. 


Let V and W be vector spaces over a field F. We define L(V, W) to be the set of all linear 
transformations from V to W. Symbolically, L(V, W) = {T: V → W | T is a linear transformation}. 


Theorem 16.2: Let V and W be vector spaces over a field F. Then L(V, W) is a vector space over F, 
where addition and scalar multiplication are defined as follows: 
S + T ∈ L(V, W) is defined by (S + T)(v) = S(v) + T(v) for S, T ∈ L(V, W). 
kT ∈ L(V, W) is defined by (kT)(v) = kT(v) for T ∈ L(V, W) and k ∈ F. 


The reader will be asked to prove Theorem 16.2 in Problem 8 below. 


If V, W, and U are vector spaces over F, and T: V → W, S: W → U are linear transformations, then the 
composition S ∘ T: V → U is a linear transformation, where S ∘ T is defined by (S ∘ T)(v) = S(T(v)) 
for all v ∈ V. To see this, let v, w ∈ V and a, b ∈ F. Then we have 


(S ∘ T)(av + bw) = S(T(av + bw)) = S(aT(v) + bT(w)) 
= a(S(T(v))) + b(S(T(w))) = a(S ∘ T)(v) + b(S ∘ T)(w). 


Example 16.2: Let T: ℝ² → ℝ³ be the linear transformation defined by T((x, y)) = (x, x + y, y) and 
let S: ℝ³ → ℝ² be the linear transformation defined by S((x, y, z)) = (z − y, x − z). Then 
S ∘ T: ℝ² → ℝ² is a linear transformation and we have 


(S ∘ T)((x, y)) = S(T((x, y))) = S((x, x + y, y)) = (−x, x − y). 


Notes: (1) In Example 16.2, the composition T ∘ S: ℝ³ → ℝ³ is also a linear transformation and we have 
(T ∘ S)((x, y, z)) = T(S((x, y, z))) = T((z − y, x − z)) = (z − y, x − y, x − z). 


(2) In general, if T: V → W, S: X → U are linear transformations, then S ∘ T is defined if and only if 
W = X. So, just because S ∘ T is defined, it does not mean that T ∘ S is also defined. For example, if 
T: ℝ → ℝ² and S: ℝ² → ℝ³, then S ∘ T is defined and S ∘ T: ℝ → ℝ³. However, T ∘ S is not defined. 
The “outputs” of the linear transformation S are ordered triples of real numbers, while the “inputs” of 
the linear transformation T are real numbers. They just don’t “match up.” 


(3) If S and T are both linear transformations from a vector space V to itself (that is, S, T: V → V), then 
the compositions S ∘ T and T ∘ S will both also be linear transformations from V to itself. 
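

The two compositions from Example 16.2 and Note 1 can be checked with a short Python sketch. It assumes vectors are represented as tuples of numbers. The helper compose and the names S_after_T and T_after_S are illustrative and are not notation from the text.

```python
def T(v):
    """T: R^2 -> R^3 from Example 16.2, (x, y) -> (x, x + y, y)."""
    x, y = v
    return (x, x + y, y)

def S(v):
    """S: R^3 -> R^2 from Example 16.2, (x, y, z) -> (z - y, x - z)."""
    x, y, z = v
    return (z - y, x - z)

def compose(f, g):
    """Return the composition f after g, that is, the map v -> f(g(v))."""
    return lambda v: f(g(v))

S_after_T = compose(S, T)  # a map from R^2 to R^2
T_after_S = compose(T, S)  # a map from R^3 to R^3

print(S_after_T((2, 5)))     # (-2, -3), which is (-x, x - y) for x = 2, y = 5
print(T_after_S((2, 5, 7)))  # (2, -3, -5), which is (z - y, x - y, x - z)
```

Trying to evaluate T on an ordered triple directly, as in T((1, 2, 3)), raises an error when the tuple is unpacked into two variables. This mirrors Note 2: a composition is defined only when the outputs of the first map are valid inputs of the second.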


By Note 3 above, in the vector space L(V, V), we can define a multiplication by ST = S ∘ T. This 
definition of multiplication gives L(V, V) a ring structure. In fact, with addition, scalar multiplication, 
and composition as previously defined, L(V, V) is a structure called a linear algebra. 


A linear algebra over a field F is a triple (A, +, ·), where (A, +) is a vector space over F, (A, +, ·) is a 
ring, and for all u, v ∈ A and k ∈ F, k(uv) = (ku)v = u(kv). 


We will call the last property “compatibility of scalar and vector multiplication.” 


Notes: (1) There are two multiplications defined in a linear algebra. As for a vector space, we have 
scalar multiplication. We will refer to the ring multiplication as vector multiplication. 


(2) Recall from Lesson 4 that a ring (A, +, ·) satisfies the first 5 properties of a vector space listed above 
(with A in place of V) together with the following three additional properties of vector multiplication: 
• (Closure) For all u, v ∈ A, u · v ∈ A. 
• (Associativity) For all u, v, w ∈ A, (u · v) · w = u · (v · w). 
• (Identity) There exists an element 1 ∈ A such that for all v ∈ A, 1 · v = v · 1 = v. 


Example 16.3: 


1. (ℝ, +, ·) is a linear algebra over ℝ, where addition and multiplication are defined in the usual 
way. In this example, scalar and vector multiplication are the same. 


2. Similarly, (ℂ, +, ·) is a linear algebra over ℂ, where addition and multiplication are defined in 
the usual way (see Lesson 7). Again, in this example, scalar and vector multiplication are the 
same. 


3. If V is a vector space over a field F, then L(V, V) is a linear algebra over F, where addition and 
scalar multiplication are defined as in Theorem 16.2, and vector multiplication is given by 
composition of linear transformations. You will be asked to verify this in Problem 9 below. 


Recall from Lesson 10 that a function f: A → B is injective if a, b ∈ A and a ≠ b implies f(a) ≠ f(b). 
Also, f is surjective if for all b ∈ B, there is a ∈ A with f(a) = b. A bijective function is one that is both 
injective and surjective. 


Also recall that a bijective function f is invertible. The inverse of f is then the function f⁻¹: B → A 
defined by f⁻¹(b) = “the unique a ∈ A such that f(a) = b.” 


By Theorem 10.6 from Lesson 10, f⁻¹ ∘ f = $i_A$ and f ∘ f⁻¹ = $i_B$, where $i_A$ and $i_B$ are the identity 
functions on A and B, respectively. Furthermore, f⁻¹ is the only function that satisfies these two 
equations. Indeed, if h: B → A also satisfies h ∘ f = $i_A$ and f ∘ h = $i_B$, then 


h = h ∘ $i_B$ = h ∘ (f ∘ f⁻¹) = (h ∘ f) ∘ f⁻¹ = $i_A$ ∘ f⁻¹ = f⁻¹. 


A bijection T: V → W that is also a linear transformation is called an isomorphism. If an isomorphism 
T: V → W exists, we say that V and W are isomorphic. As is always the case with algebraic structures, 
isomorphic vector spaces are essentially identical. The only difference between them is the “names” 
of the elements. Isomorphisms were covered in more generality in Lesson 11. 


If a bijective function happens to be a linear transformation between two vector spaces, it’s nice to 
know that the inverse function is also a linear transformation. We prove this now. 


Theorem 16.3: Let T: V → W be an invertible linear transformation. Then T⁻¹: W → V is also a linear 
transformation. 


Proof: Let T: V → W be an invertible linear transformation, let u, v ∈ W, and let a, b ∈ F. Then by the 
linearity of T, we have 


T(aT⁻¹(u) + bT⁻¹(v)) = aT(T⁻¹(u)) + bT(T⁻¹(v)) = au + bv. 


Since T is injective, aT⁻¹(u) + bT⁻¹(v) is the unique element of V whose image under T is au + bv. 
By the definition of T⁻¹, T⁻¹(au + bv) = aT⁻¹(u) + bT⁻¹(v). □ 


Example 16.4: 


1. Let V = W = ℂ be vector spaces over ℝ and define T: ℂ → ℂ by T(z) = 5z, as we did in part 1 
of Example 16.1. If z ≠ w, then 5z ≠ 5w, and so T is injective. Also, if w ∈ ℂ, then we have 
T((1/5)w) = 5((1/5)w) = w. So, T is surjective. It follows that T is invertible and that the inverse of 
T is defined by T⁻¹(z) = (1/5)z. By Theorem 16.3, T⁻¹: ℂ → ℂ is also a linear transformation. In 
the terminology of Lesson 11, T is an automorphism. In other words, T is an isomorphism from 
ℂ to itself. 


2. Let V be a vector space over a field F with basis {v₁, v₂, v₃}. Then let T: V → F³ be the unique 
linear transformation such that T(v₁) = (1, 0, 0), T(v₂) = (0, 1, 0), and T(v₃) = (0, 0, 1). In 
other words, if v ∈ V, since {v₁, v₂, v₃} is a basis of V, we can write v = c₁v₁ + c₂v₂ + c₃v₃, 
and T is defined by T(v) = c₁T(v₁) + c₂T(v₂) + c₃T(v₃) = (c₁, c₂, c₃). 


To see that T is injective, suppose that T(c₁v₁ + c₂v₂ + c₃v₃) = T(d₁v₁ + d₂v₂ + d₃v₃). 
Then (c₁, c₂, c₃) = (d₁, d₂, d₃). It follows that c₁ = d₁, c₂ = d₂, and c₃ = d₃. Therefore, 
c₁v₁ + c₂v₂ + c₃v₃ = d₁v₁ + d₂v₂ + d₃v₃ and so, T is injective. 


Now, if (a, b, c) ∈ F³, then T(av₁ + bv₂ + cv₃) = (a, b, c) and so, T is surjective. From this 
computation, we also see that T⁻¹: F³ → V is defined by T⁻¹((a, b, c)) = av₁ + bv₂ + cv₃. 


It follows that T: V → F³ is an isomorphism, so that V is isomorphic to F³. 


Essentially the same argument as above can be used to show that if V is a vector space over a 
field F with a basis consisting of n vectors, then V is isomorphic to Fⁿ. 


Matrices 


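Recall from Lesson 8 that for m, n ∈ ℤ⁺, an m × n matrix over a field F is a rectangular array with m 
rows and n columns, and entries in F. For example, the matrix 

\[ H = \begin{bmatrix} i & 2 - 5i & \frac{1}{5} \\ -1 & \sqrt{3} & 7 + i \end{bmatrix} \]

is a 2 × 3 matrix over ℂ. We will generally use a capital letter to represent a matrix, and the corresponding 
lowercase letter with double subscripts to represent the entries of the matrix. We use the first subscript 
for the row and the second subscript for the column. Using the matrix H above as an example, we see 
that $h_{11} = i$, $h_{12} = 2 - 5i$, $h_{13} = \frac{1}{5}$, $h_{21} = -1$, $h_{22} = \sqrt{3}$, and $h_{23} = 7 + i$. 


If A is an m × n matrix, then we can visualize A as follows: 

\[ A = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix} \]


We let $M^F_{mn}$ be the set of all m × n matrices over the field F. Recall that we add two matrices 
A, B ∈ $M^F_{mn}$ to get A + B ∈ $M^F_{mn}$ using the rule $(a + b)_{ij} = a_{ij} + b_{ij}$. We multiply a matrix A ∈ $M^F_{mn}$ 
by a scalar k ∈ F using the rule $(ka)_{ij} = ka_{ij}$. We can visualize these computations as follows: 

\[ \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix} + \begin{bmatrix} b_{11} & \cdots & b_{1n} \\ \vdots & & \vdots \\ b_{m1} & \cdots & b_{mn} \end{bmatrix} = \begin{bmatrix} a_{11} + b_{11} & \cdots & a_{1n} + b_{1n} \\ \vdots & & \vdots \\ a_{m1} + b_{m1} & \cdots & a_{mn} + b_{mn} \end{bmatrix} \]

\[ k \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix} = \begin{bmatrix} ka_{11} & \cdots & ka_{1n} \\ \vdots & & \vdots \\ ka_{m1} & \cdots & ka_{mn} \end{bmatrix} \]

With these operations of addition and scalar multiplication, $M^F_{mn}$ is a vector space over F. 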


We would now like to turn $M^F_{mn}$ into a linear algebra over F by defining a vector multiplication in $M^F_{mn}$. 
Notice that we will not be turning all vector spaces $M^F_{mn}$ into linear algebras. We will be able to do this 
only when m = n. That is, the linear algebra will consist only of square matrices of a specific size. 


We first define the product of an m × n matrix with an n × p matrix, where m, n, p are positive 
integers. Notice that to take the product AB we first insist that the number of columns of A be equal 
to the number of rows of B (these are the “inner” two numbers in the expressions “m × n” and 
“n × p”). 


So, how do we actually multiply two matrices? This is a bit complicated and requires just a little practice. 
Let’s begin by walking through an example while informally describing the procedure, so that we can 
get a feel for how matrix multiplication works before getting caught up in the “messy looking” 
definition. 


Let 

\[ A = \begin{bmatrix} 0 & 1 \\ 3 & 2 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 1 & 2 & 0 \\ 0 & 3 & 6 \end{bmatrix}. \]

Notice that A is a 2 × 2 matrix and B is a 2 × 3 matrix. Since A 
has 2 columns and B has 2 rows, we will be able to multiply the two matrices. 


For each row of the first matrix and each column of the second matrix, we add up the products entry 
by entry. Let’s compute the product AB as an example. 

\[ AB = \begin{bmatrix} 0 & 1 \\ 3 & 2 \end{bmatrix} \begin{bmatrix} 1 & 2 & 0 \\ 0 & 3 & 6 \end{bmatrix} = \begin{bmatrix} x & y & z \\ u & v & w \end{bmatrix} \]

Since x is in the first row and first column, we use the first row of A and the first column of B to get 

\[ x = \begin{bmatrix} 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = 0 \cdot 1 + 1 \cdot 0 = 0 + 0 = 0. \]

Since u is in the second row and first column, we use the second row of A and the first column of B to 
get 

\[ u = \begin{bmatrix} 3 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = 3 \cdot 1 + 2 \cdot 0 = 3 + 0 = 3. \]

The reader should attempt to follow this procedure to compute the values of the remaining entries. 
The final product is 

\[ AB = \begin{bmatrix} 0 & 3 & 6 \\ 3 & 12 & 12 \end{bmatrix}. \]

Notes: (1) The product of a 2 × 2 matrix and a 2 × 3 matrix is a 2 × 3 matrix. 


(2) More generally, the product of an m × n matrix and an n × p matrix is an m × p matrix. Observe 
that the innermost numbers (both n) must agree, and the resulting product has dimensions given by 
the outermost numbers (m and p). 
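

The row-by-column procedure described above can be written as a short Python sketch. It assumes a matrix is stored as a list of rows, and it takes A and B to be the 2 × 2 and 2 × 3 matrices of the worked example as they are read here (rows 0 1 and 3 2 for A, rows 1 2 0 and 0 3 6 for B). The function name matmul is an illustrative choice, not notation from the text.

```python
def matmul(A, B):
    """Product of an m x n matrix A and an n x p matrix B, both given as lists of rows.

    The entry in row i and column j is the sum of the products of the
    entries of row i of A with the entries of column j of B.
    """
    m, n, p = len(A), len(B), len(B[0])
    if any(len(row) != n for row in A):
        raise ValueError("the number of columns of A must equal the number of rows of B")
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]

A = [[0, 1],
     [3, 2]]
B = [[1, 2, 0],
     [0, 3, 6]]

print(matmul(A, B))  # [[0, 3, 6], [3, 12, 12]]
```

The size check reflects the requirement that the inner dimensions agree. Calling matmul(B, A) raises ValueError because B has 3 columns while A has only 2 rows.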
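

We formally define matrix multiplication as follows. Let A be the m × n matrix 

\[ A = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix} \]

and let B be the n × p matrix 

\[ B = \begin{bmatrix} b_{11} & \cdots & b_{1p} \\ \vdots & & \vdots \\ b_{n1} & \cdots & b_{np} \end{bmatrix}. \]

We define the product AB to be the m × p matrix 

\[ C = \begin{bmatrix} c_{11} & \cdots & c_{1p} \\ \vdots & & \vdots \\ c_{m1} & \cdots & c_{mp} \end{bmatrix} \]

such that 

\[ c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj} = \sum_{k=1}^{n} a_{ik}b_{kj}. \]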


Notes: (1) The symbol Σ is the Greek letter Sigma. In mathematics, this symbol is often used to denote 
a sum. Σ is generally used to abbreviate a very large sum or a sum of unknown length by specifying 
what a typical term of the sum looks like. Let’s look at a simpler example first before we analyze the 
more complicated one above: 

\[ \sum_{k=1}^{5} k^2 = 1^2 + 2^2 + 3^2 + 4^2 + 5^2 = 1 + 4 + 9 + 16 + 25 = 55. \]

The expression “k = 1” written underneath the symbol indicates that we get the first term of the sum 
by replacing k by 1 in the given expression. When we replace k by 1 in the expression k², we get 1². 


For the second term, we simply increase k by 1 to get k = 2. So, we replace k by 2 to get k² = 2². 


We continue in this fashion, increasing k by 1 each time until we reach the number written above the 
symbol. In this case, that is k = 5. 


(2) Let’s now get back to the expression that we’re interested in. 

\[ c_{ij} = \sum_{k=1}^{n} a_{ik}b_{kj} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj} \]

Once again, the expression “k = 1” written underneath the symbol indicates that we get the first term 
of the sum by replacing k by 1 in the given expression. When we replace k by 1 in the expression 
$a_{ik}b_{kj}$, we get $a_{i1}b_{1j}$. Notice that this is the first term of $c_{ij}$. 


For the second term, we simply increase k by 1 to get k = 2. So, we replace k by 2 to get $a_{i2}b_{2j}$. 


We continue in this fashion, increasing k by 1 each time until we reach the number written above the 
symbol. In this case, that is k = n. So, the last term is $a_{in}b_{nj}$. 


(3) In general, we get the entry $c_{ij}$ in the ith row and jth column of C = AB by “multiplying” the ith 
row of A with the jth column of B. We can think of the computation like this: 

\[ \begin{bmatrix} a_{i1} & a_{i2} & \cdots & a_{in} \end{bmatrix} \begin{bmatrix} b_{1j} \\ b_{2j} \\ \vdots \\ b_{nj} \end{bmatrix} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj} \]

Notice how we multiply the leftmost entry $a_{i1}$ by the topmost entry $b_{1j}$. Then we move one step to 
the right to $a_{i2}$ and one step down to $b_{2j}$ to form the next product, ... and so on. 


It is fairly straightforward to verify that with our definitions of addition, scalar multiplication, and matrix 
multiplication, for each n ∈ ℤ⁺, $M^F_{nn}$ is a linear algebra over F. I leave this as an exercise for the reader. 
Note that it is important that the number of rows and columns of our matrices are the same. Otherwise, 
the matrix products will not be defined. 


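Example 16.5: 


1. We have 

\[ \begin{bmatrix} 1 & 2 & 3 & 4 \end{bmatrix} \cdot \begin{bmatrix} 5 \\ 1 \\ -2 \\ 3 \end{bmatrix} = [1 \cdot 5 + 2 \cdot 1 + 3(-2) + 4 \cdot 3] = [5 + 2 - 6 + 12] = [13]. \]

We generally identify a 1 × 1 matrix with its only entry. So, 

\[ \begin{bmatrix} 1 & 2 & 3 & 4 \end{bmatrix} \cdot \begin{bmatrix} 5 \\ 1 \\ -2 \\ 3 \end{bmatrix} = 13. \]


2. We have 

\[ \begin{bmatrix} 5 \\ 1 \\ -2 \\ 3 \end{bmatrix} \cdot \begin{bmatrix} 1 & 2 & 3 & 4 \end{bmatrix} = \begin{bmatrix} 5 & 10 & 15 & 20 \\ 1 & 2 & 3 & 4 \\ -2 & -4 & -6 & -8 \\ 3 & 6 & 9 & 12 \end{bmatrix}. \]

Notice that 

\[ \begin{bmatrix} 1 & 2 & 3 & 4 \end{bmatrix} \cdot \begin{bmatrix} 5 \\ 1 \\ -2 \\ 3 \end{bmatrix} \neq \begin{bmatrix} 5 \\ 1 \\ -2 \\ 3 \end{bmatrix} \cdot \begin{bmatrix} 1 & 2 & 3 & 4 \end{bmatrix}, \]

and in fact, the two products do not even have the same size. This shows that if AB and BA are both 
defined, then they do not need to be equal. 
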

fo a al=loa3 o+al-& ol 
0 2) 71 27 _/O0+0 0+2]_ 7/0 2 
È all i=l 40 poe le | al: 
Notice that [7 TE Jal Ti d 


This shows that even if A and B are square matrices of the same size, in general AB ≠ BA. So, 
matrix multiplication is not commutative. $M^F_{nn}$ is a noncommutative linear algebra. 
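

Noncommutativity is easy to confirm by direct computation. The sketch below assumes matrices are stored as lists of rows and uses an illustrative helper matmul. The row and column matrices are the ones from parts 1 and 2 of Example 16.5. The square matrices P and Q are hypothetical matrices chosen only for this illustration and are not necessarily the ones used in the example.

```python
def matmul(A, B):
    """Product of matrices given as lists of rows (inner sizes are assumed to agree)."""
    n = len(B)
    return [[sum(row[k] * B[k][j] for k in range(n)) for j in range(len(B[0]))]
            for row in A]

row = [[1, 2, 3, 4]]          # a 1 x 4 matrix
col = [[5], [1], [-2], [3]]   # a 4 x 1 matrix

print(matmul(row, col))  # [[13]]
print(matmul(col, row))  # [[5, 10, 15, 20], [1, 2, 3, 4], [-2, -4, -6, -8], [3, 6, 9, 12]]

# Hypothetical 2 x 2 matrices, chosen for illustration only.
P = [[0, 1],
     [0, 0]]
Q = [[0, 0],
     [1, 0]]

print(matmul(P, Q))  # [[1, 0], [0, 0]]
print(matmul(Q, P))  # [[0, 0], [0, 1]]
```

Both orders of multiplication are defined for P and Q, and the results have the same size, yet the two products differ.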


The Matrix of a Linear Transformation 


Let T ∈ L(V, W) and let B = {v₁, v₂, ..., vₙ} and C = {w₁, w₂, ..., wₘ} be bases of V and W, 
respectively. Recall that T is completely determined by the values of T(v₁), T(v₂), ..., T(vₙ). 
Furthermore, since T(v₁), T(v₂), ..., T(vₙ) ∈ W and C is a basis for W, each of T(v₁), T(v₂), ..., T(vₙ) 
can be written as a linear combination of the vectors in C. So, we have 

\[ \begin{aligned} T(v_1) &= a_{11}w_1 + a_{21}w_2 + \cdots + a_{m1}w_m \\ T(v_2) &= a_{12}w_1 + a_{22}w_2 + \cdots + a_{m2}w_m \\ &\ \ \vdots \\ T(v_j) &= a_{1j}w_1 + a_{2j}w_2 + \cdots + a_{mj}w_m \\ &\ \ \vdots \\ T(v_n) &= a_{1n}w_1 + a_{2n}w_2 + \cdots + a_{mn}w_m \end{aligned} \]

Here, we have $a_{ij}$ ∈ F for each i = 1, 2, ..., m and j = 1, 2, ..., n. We form the following matrix: 

\[ M_T(B, C) = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix} \]

$M_T(B, C)$ is called the matrix of the linear transformation T with respect to the bases B and C. 


Note: The coefficients in the expression $T(v_j) = a_{1j}w_1 + a_{2j}w_2 + \cdots + a_{mj}w_m$ become the jth 
column of $M_T(B, C)$. Your first instinct might be to form the row $[a_{1j} \ a_{2j} \ \cdots \ a_{mj}]$, but this is incorrect. 
Pay careful attention to how we form $M_T(B, C)$ in part 2 of Example 16.6 below to make sure that you 
avoid this error. 


Example 16.6: 


1. Consider the linear transformation T: ℂ → ℂ from part 1 of Example 16.1. We are considering 
ℂ as a vector space over ℝ and T is defined by T(z) = 5z. Let’s use the standard basis for ℂ, so 
that B = C = {1 + 0i, 0 + 1i} = {1, i}. We have 


T(1) = 5 = 5 · 1 + 0 · i 
T(i) = 5i = 0 · 1 + 5 · i 

The matrix of T with respect to the standard basis is 

\[ M_T(\{1, i\}, \{1, i\}) = \begin{bmatrix} 5 & 0 \\ 0 & 5 \end{bmatrix}. \]

In this case, since T maps a vector space to itself and we are using the same 
basis for both “copies” of ℂ, we can abbreviate $M_T(\{1, i\}, \{1, i\})$ as $M_T(\{1, i\})$. Furthermore, 
since we are using the standard basis, we can abbreviate $M_T(\{1, i\}, \{1, i\})$ even further as $M_T$. 
So, we can simply write 

\[ M_T = \begin{bmatrix} 5 & 0 \\ 0 & 5 \end{bmatrix}. \]

Now, let z = a + bi ∈ ℂ and write z as the column vector $z = \begin{bmatrix} a \\ b \end{bmatrix}$. We have 

\[ M_T \cdot z = \begin{bmatrix} 5 & 0 \\ 0 & 5 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} 5a \\ 5b \end{bmatrix} = 5 \begin{bmatrix} a \\ b \end{bmatrix} = 5z = T(z). \]

So, multiplication on the left by $M_T$ gives the same result as applying the transformation T. 


2. Consider the linear transformation T: ℝ⁴ → ℝ³ from part 2 of Example 16.1. We are considering 
ℝ⁴ and ℝ³ as vector spaces over ℝ and T is defined by 


T((x, y, z, w)) = (x + z, 2x − 3y, 5y − 2w). 
Let’s use the standard bases for ℝ⁴ and ℝ³, so that 


B = {(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)} and C = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}. 


We have 
T((1, 0, 0, 0)) = (1, 2, 0) 
T((0, 1, 0, 0)) = (0, −3, 5) 
T((0, 0, 1, 0)) = (1, 0, 0) 
T((0, 0, 0, 1)) = (0, 0, −2) 

The matrix of T with respect to the standard bases is 

\[ M_T = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 2 & -3 & 0 & 0 \\ 0 & 5 & 0 & -2 \end{bmatrix}. \]

Once again, we abbreviate $M_T(B, C)$ as $M_T$ because we are using the standard bases. 


Now, let v = (x, y, z, w) ∈ ℝ⁴ and write v as the column vector $v = \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix}$. We have 

\[ M_T \cdot v = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 2 & -3 & 0 & 0 \\ 0 & 5 & 0 & -2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} x + z \\ 2x - 3y \\ 5y - 2w \end{bmatrix} = T(v). \]

So, once again, multiplication on the left by $M_T$ gives the same result as applying the 
transformation T. 
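

The recipe “the image of the jth basis vector becomes the jth column” can be turned into a small Python sketch. It assumes the standard bases are used on both sides, so the coordinates of T(eⱼ) are just its entries. The helper names standard_basis, matrix_of, and apply are illustrative and are not notation from the text.

```python
def T(v):
    """T: R^4 -> R^3 with (x, y, z, w) -> (x + z, 2x - 3y, 5y - 2w)."""
    x, y, z, w = v
    return (x + z, 2 * x - 3 * y, 5 * y - 2 * w)

def standard_basis(n):
    """The standard basis vectors e_1, ..., e_n of R^n as tuples."""
    return [tuple(1 if i == j else 0 for i in range(n)) for j in range(n)]

def matrix_of(T, n):
    """Matrix of T with respect to the standard bases, as a list of rows.

    images[j] is T(e_j), and it is placed in column j, not in row j.
    """
    images = [T(e) for e in standard_basis(n)]
    m = len(images[0])
    return [[images[j][i] for j in range(n)] for i in range(m)]

def apply(M, v):
    """Multiply the matrix M (list of rows) by the column vector v."""
    return tuple(sum(a * x for a, x in zip(row, v)) for row in M)

M = matrix_of(T, 4)
print(M)                       # [[1, 0, 1, 0], [2, -3, 0, 0], [0, 5, 0, -2]]
print(apply(M, (1, 2, 3, 4)))  # (4, -4, 2)
print(T((1, 2, 3, 4)))         # (4, -4, 2)
```

Building the matrix row by row from the images instead (that is, returning images itself) would produce a 4 × 3 matrix, which is the error the Note before Example 16.6 warns about.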


Let V be a vector space over F with a finite basis. Then we say that V is finite-dimensional. If 
B = {v₁, v₂, ..., vₙ} is a basis of V, then by Problem 12 from Lesson 8, all bases of V have n elements. In this case, we 
say that V is n-dimensional, and we write dim V = n. 


Theorem 16.4: Let V be an n-dimensional vector space over a field F. Then there is a linear algebra 
isomorphism F: L(V, V) → $M^F_{nn}$. 


You will be asked to prove Theorem 16.4 in Problem 15 below. 


Images and Kernels 


Let T: V → W be a linear transformation. The image (or range) of T is the set T[V] = {T(v) | v ∈ V} 
and the kernel (or null space) of T is the set ker(T) = {v ∈ V | T(v) = 0}. 


Example 16.7: Let T: ℝ⁴ → ℝ³ be defined by T((x, y, z, w)) = (x + y, x − z, x + 2w). Let’s compute 
T[ℝ⁴] and ker(T). First, T[ℝ⁴] consists of all vectors of the form 


(x + y, x − z, x + 2w) = (x + y)(1, 0, 0) + (x − z)(0, 1, 0) + (x + 2w)(0, 0, 1). 


So, if (v₁, v₂, v₃) ∈ ℝ³, let x = 0, y = v₁, z = −v₂, and w = (1/2)v₃. Then we see that 


(x + y)(1, 0, 0) + (x − z)(0, 1, 0) + (x + 2w)(0, 0, 1) 
= v₁(1, 0, 0) + v₂(0, 1, 0) + v₃(0, 0, 1) = (v₁, v₂, v₃). 


Therefore, ℝ³ ⊆ T[ℝ⁴]. Since it is clear that T[ℝ⁴] ⊆ ℝ³, we have T[ℝ⁴] = ℝ³. 


Now, (x, y, z, w) ∈ ker(T) if and only if (x + y, x − z, x + 2w) = (0, 0, 0) if and only if x + y = 0, 
x − z = 0, and x + 2w = 0 if and only if y = −x, z = x, and w = −x/2 if and only if 


(x, y, z, w) = (x, −x, x, −x/2) = x(1, −1, 1, −1/2). 


So, every element of ker(T) is a scalar multiple of (1, −1, 1, −1/2). Thus, ker(T) ⊆ span{(1, −1, 1, −1/2)}. 


Conversely, an element of span{(1, −1, 1, −1/2)} has the form (v, −v, v, −(1/2)v), and we have 
T((v, −v, v, −(1/2)v)) = (v − v, v − v, v + 2(−(1/2)v)) = (0, 0, 0). So, span{(1, −1, 1, −1/2)} ⊆ ker(T). 
Therefore, ker(T) = span{(1, −1, 1, −1/2)}. 


Notice that T[ℝ⁴] is a subspace of ℝ³ (in fact, T[ℝ⁴] = ℝ³) and ker(T) is a subspace of ℝ⁴. Also, the 
sum of the dimensions of T[ℝ⁴] and ker(T) is 3 + 1 = 4, which is the dimension of ℝ⁴. None of this is 
a coincidence, as we will see in the next few theorems. 
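

Both halves of Example 16.7 can be spot-checked with exact arithmetic. The following Python sketch assumes vectors are tuples and uses Fraction so that the entry −1/2 is represented exactly. The names preimage, kernel_generator, and scale are illustrative and do not come from the text.

```python
from fractions import Fraction

def T(v):
    """T: R^4 -> R^3 with (x, y, z, w) -> (x + y, x - z, x + 2w)."""
    x, y, z, w = v
    return (x + y, x - z, x + 2 * w)

def preimage(target):
    """One vector that T sends to target, using the choice
    x = 0, y = v1, z = -v2, w = v3 / 2 from Example 16.7."""
    v1, v2, v3 = target
    return (Fraction(0), v1, -v2, Fraction(v3) / 2)

def scale(k, v):
    """Scalar multiple k*v of a tuple v."""
    return tuple(k * a for a in v)

# The vector (1, -1, 1, -1/2) that spans ker(T) in Example 16.7.
kernel_generator = (Fraction(1), Fraction(-1), Fraction(1), Fraction(-1, 2))

print(T(preimage((3, -7, 5))) == (3, -7, 5))            # True
print(T(scale(Fraction(4), kernel_generator)) == (0, 0, 0))  # True
```

The first check illustrates surjectivity for one target vector, and the second shows that a scalar multiple of the spanning vector lands in the kernel. Neither check replaces the argument in the text, which handles all vectors at once.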


Theorem 16.5: Let V and W be vector spaces over a field F and let T: V → W be a linear transformation. 
Then T[V] ≤ W. 


Proof: We have T(0) = T(0 + 0) = T(0) + T(0). Therefore, T(0) = 0. It follows that 0 ∈ T[V]. 


Let w, t ∈ T[V]. Then there are u, v ∈ V with T(u) = w and T(v) = t. It then follows that 
T(u + v) = T(u) + T(v) = w + t. So, w + t ∈ T[V]. 


Let w ∈ T[V] and k ∈ F. Then there is u ∈ V with T(u) = w. We have T(ku) = kT(u) = kw. 
Therefore, kw ∈ T[V]. 


By Theorem 8.1 from Lesson 8, T[V] ≤ W. □ 


Theorem 16.6: Let V and W be vector spaces over a field F and let T: V → W be a linear transformation. 
Then ker(T) ≤ V. 


Proof: As in the proof of Theorem 16.5, we have T(0) = 0. So, 0 ∈ ker(T). 

Let u, v ∈ ker(T). Then T(u + v) = T(u) + T(v) = 0 + 0 = 0. So, u + v ∈ ker(T). 

Let u ∈ ker(T) and k ∈ F. Then T(ku) = kT(u) = k · 0 = 0. Therefore, ku ∈ ker(T). 

By Theorem 8.1 from Lesson 8, ker(T) ≤ V. □ 


Theorem 16.7: Let V and W be vector spaces over a field F and let T: V → W be a linear transformation. 
Then ker(T) = {0} if and only if T is injective. 


Proof: Suppose that ker(T) = {0}, let u, v ∈ V, and let T(u) = T(v). Then T(u) − T(v) = 0. It follows 
that T(u − v) = T(u) − T(v) = 0. So, u − v ∈ ker(T). Since ker(T) = {0}, u − v = 0. Therefore, 
u = v. Since u, v ∈ V were arbitrary, T is injective. 


Conversely, suppose that T is injective, and let u ∈ ker(T). Then T(u) = 0. But also, by the proof of 
Theorem 16.5, T(0) = 0. So, T(u) = T(0). Since T is injective, u = 0. Since u ∈ ker(T) was arbitrary, 
ker(T) ⊆ {0}. By the proof of Theorem 16.5, T(0) = 0, so that 0 ∈ ker(T), and so, {0} ⊆ ker(T). It 
follows that ker(T) = {0}. □ 


If V and W are vector spaces over a field F, and T: V → W is a linear transformation, then the rank of 
T is the dimension of T[V] and the nullity of T is the dimension of ker(T). 


Theorem 16.8: Let V and W be vector spaces over a field F with dim V = n and let T: V → W be a 
linear transformation. Then rank T + nullity T = n. 


Note: Before proving the theorem, let’s observe that in a finite-dimensional vector space V, any vectors 
that are linearly independent can be extended to a basis of V. 


To see this, let v₁, v₂, ..., vₖ be linearly independent and let u₁, u₂, ..., uₘ be any vectors such that 
span{u₁, u₂, ..., uₘ} = V. We will decide one by one if we should throw in or exclude each uⱼ. 


Specifically, we start by first letting B₀ = {v₁, v₂, ..., vₖ} and then 

\[ B_1 = \begin{cases} B_0 & \text{if } u_1 \in \operatorname{span} B_0, \\ B_0 \cup \{u_1\} & \text{if } u_1 \notin \operatorname{span} B_0. \end{cases} \]

In general, for each j = 1, 2, ..., m, we let 

\[ B_j = \begin{cases} B_{j-1} & \text{if } u_j \in \operatorname{span} B_{j-1}, \\ B_{j-1} \cup \{u_j\} & \text{if } u_j \notin \operatorname{span} B_{j-1}. \end{cases} \]

By Problem 6 from Lesson 8, for each j, $B_j$ is linearly independent. Since for each j, uⱼ ∈ span $B_j$ and $B_j$ ⊆ $B_m$, 
V = span{u₁, u₂, ..., uₘ} = span $B_m$. Therefore, $B_m$ is a basis of V. 


Proof of Theorem 16.8: Suppose nullity T = k, where 0 ≤ k ≤ n. Then there is a basis {v₁, v₂, ..., vₖ} 
of ker(T) (note that if k = 0, this basis is the empty set). In particular, the vectors v₁, v₂, ..., vₖ are 
linearly independent. By the note above, we can extend these vectors to a basis B of V, let’s say 
B = {v₁, v₂, ..., vₖ, u₁, u₂, ..., uₘ}. So, we have n = k + m. Let’s show that {T(u₁), T(u₂), ..., T(uₘ)} 
is a basis of T[V]. 


For linear independence of T(u₁), T(u₂), ..., T(uₘ), note that since T is a linear transformation, 
c₁T(u₁) + c₂T(u₂) + ⋯ + cₘT(uₘ) = 0 is equivalent to T(c₁u₁ + c₂u₂ + ⋯ + cₘuₘ) = 0, which is 
equivalent to c₁u₁ + c₂u₂ + ⋯ + cₘuₘ ∈ ker(T). Since {v₁, v₂, ..., vₖ} is a basis of ker(T), we can 
find weights d₁, d₂, ..., dₖ such that c₁u₁ + c₂u₂ + ⋯ + cₘuₘ = d₁v₁ + d₂v₂ + ⋯ + dₖvₖ. Since B is 
a basis of V, all weights (the cᵢ’s and dᵢ’s) are 0. So, T(u₁), T(u₂), ..., T(uₘ) are linearly independent. 


To see that T[V] = span{T(u₁), T(u₂), ..., T(uₘ)}, let v ∈ V. Since B is a basis of V, we can write v as 
a linear combination v = c₁v₁ + c₂v₂ + ⋯ + cₖvₖ + d₁u₁ + d₂u₂ + ⋯ + dₘuₘ. Applying the linear 
transformation T gives us 
T(v) = T(c₁v₁ + c₂v₂ + ⋯ + cₖvₖ + d₁u₁ + d₂u₂ + ⋯ + dₘuₘ) 
= c₁T(v₁) + c₂T(v₂) + ⋯ + cₖT(vₖ) + d₁T(u₁) + d₂T(u₂) + ⋯ + dₘT(uₘ) 
= d₁T(u₁) + d₂T(u₂) + ⋯ + dₘT(uₘ). 


Note that T(v₁), T(v₂), ..., T(vₖ) are all 0 because v₁, v₂, ..., vₖ ∈ ker(T). 


Since each vector of the form T(v) can be written as a linear combination of T(u₁), T(u₂), ..., T(uₘ), 
we have shown that T[V] = span{T(u₁), T(u₂), ..., T(uₘ)}. 


Since T(u₁), T(u₂), ..., T(uₘ) are linearly independent and T[V] = span{T(u₁), T(u₂), ..., T(uₘ)}, it 
follows that {T(u₁), T(u₂), ..., T(uₘ)} is a basis of T[V]. Therefore, rank T = m, and so 
rank T + nullity T = m + k = n. □ 
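

For a transformation given by a matrix, the rank and a basis of the kernel can be computed by row reduction, which gives a concrete way to see Theorem 16.8 at work. The sketch below is one standard approach and not the method of the proof above. It assumes the matrix has exact rational entries, and the function names rref, kernel_basis, and apply are illustrative. The matrix M is the matrix of the map T((x, y, z, w)) = (x + y, x − z, x + 2w) from Example 16.7 with respect to the standard bases.

```python
from fractions import Fraction

def rref(M):
    """Return (R, pivots): the reduced row echelon form of M and its pivot columns."""
    R = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(R), len(R[0])
    pivots, r = [], 0
    for c in range(cols):
        # Look for a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, rows) if R[i][c] != 0), None)
        if p is None:
            continue
        R[r], R[p] = R[p], R[r]
        piv = R[r][c]
        R[r] = [x / piv for x in R[r]]
        for i in range(rows):
            if i != r and R[i][c] != 0:
                f = R[i][c]
                R[i] = [a - f * b for a, b in zip(R[i], R[r])]
        pivots.append(c)
        r += 1
        if r == rows:
            break
    return R, pivots

def kernel_basis(M):
    """One basis vector of the kernel for each non-pivot (free) column."""
    R, pivots = rref(M)
    n = len(M[0])
    basis = []
    for f in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[f] = Fraction(1)
        for row, p in zip(R, pivots):
            v[p] = -row[f]
        basis.append(v)
    return basis

def apply(M, v):
    """Multiply the matrix M (list of rows) by the column vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in M]

M = [[1, 1, 0, 0],
     [1, 0, -1, 0],
     [1, 0, 0, 2]]

rank = len(rref(M)[1])
nullity = len(kernel_basis(M))
print(rank, nullity, rank + nullity)  # 3 1 4
```

Here the rank is the number of pivot columns and the nullity is the number of free columns, so their sum is the number of columns, which is dim V. The single kernel vector produced is (−2, 2, −2, 1), a scalar multiple of (1, −1, 1, −1/2).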


Eigenvalues and Eigenvectors 
We now restrict our attention to linear transformations from a vector space to itself. For a vector space 
V, we will abbreviate the linear algebra L(V, V) by L(V). 
If U ≤ V, we say that U is invariant under T ∈ L(V) if T[U] ⊆ U. 


Example 16.8: Let V be a vector space and let T ∈ L(V). 
1. {0} is invariant under T. Indeed, T(0) = 0 by the proof of Theorem 16.5. 
2. V is invariant under T. Indeed, if v ∈ V, then T(v) ∈ V. 
3. ker(T) is invariant under T. To see this, let v ∈ ker(T). Then T(v) = 0 ∈ ker(T). 
4. T[V] is invariant under T. To see this, let w ∈ T[V]. Then w ∈ V, and so T(w) ∈ T[V]. 


Let V be a vector space over a field F. We call a subspace U ≤ V a simple subspace if it consists of all 
scalar multiples of a single vector. In other words, U is simple if there is a u ∈ V such that 
U = {ku | k ∈ F}. 


Theorem 16.9: Let V be a vector space over a field F, let U = {ku | k ∈ F} be a simple subspace of V, 
and let T ∈ L(V). Then U is invariant under T if and only if there is λ ∈ F such that T(u) = λu. 


Proof: Suppose that U = {ku | k ∈ F} is invariant under T. Since u = 1u ∈ U, we have T(u) ∈ U. It follows that T(u) = λu 
for some λ ∈ F. 


Conversely, suppose there is λ ∈ F such that T(u) = λu. Let v ∈ U. Then there is k ∈ F such that 
v = ku. Then T(v) = T(ku) = kT(u) = k(λu) = (kλ)u ∈ U. Since v ∈ U was arbitrary, T[U] ⊆ U. 


Therefore, U is invariant under T. □ 


Let V be a vector space over a field F and let T ∈ L(V). A scalar λ ∈ F is called an eigenvalue of T if 
there is a nonzero vector v ∈ V such that T(v) = λv. The vector v is called an eigenvector of T. 


Notes: (1) If v is the zero vector, then T(v) = T(0) = 0 = λ · 0 for every scalar λ. This is why we 
exclude the zero vector from being an eigenvector. An eigenvector must be nonzero. 


(2) If we let I: V → V be the identity linear transformation defined by I(v) = v for all v ∈ V, then we 
can write λv as λI(v). So, the equation T(v) = λv is equivalent to the equation (T − λI)(v) = 0. 


(3) It follows from Note 2 that λ is an eigenvalue of T if and only if ker(T − λI) ≠ {0}. By Theorem 
16.7, λ is an eigenvalue of T if and only if T − λI is not injective. 


(4) By Note 2, v is an eigenvector of T corresponding to eigenvalue λ if and only if v is a nonzero vector 
such that (T − λI)(v) = 0. So, the set of eigenvectors of T corresponding to λ, together with the zero 
vector, is ker(T − λI). By Theorem 16.6, ker(T − λI) is a subspace of V. We call this subspace the 
eigenspace of V corresponding to the eigenvalue λ. 


Example 16.9: 


1. Let V be any vector space over a field F and let I: V → V be the identity linear transformation. 
Then for any v ∈ V, I(v) = v = 1v. So, we see that 1 is the only eigenvalue of I and every 
nonzero vector v ∈ V is an eigenvector of I for the eigenvalue 1. 


2. More generally, if k ∈ F, then the linear transformation kI satisfies (kI)(v) = kI(v) = kv for 
all v ∈ V. So, we see that k is the only eigenvalue of kI and every nonzero vector v ∈ V is an 
eigenvector of kI for the eigenvalue k. 


3. Consider $\mathbb{C}^2$ as a vector space over $\mathbb{C}$ and define $T: \mathbb{C}^2 \to \mathbb{C}^2$ by $T((z, w)) = (-w, z)$. Observe
that $\lambda = i$ is an eigenvalue of $T$ with corresponding eigenvector $(1, -i)$. Indeed, we have
$T((1, -i)) = (i, 1)$ and $i(1, -i) = (i, -i^2) = (i, 1)$. So, $T((1, -i)) = i(1, -i)$.
Let's find all the eigenvalues of this linear transformation. We need to solve the equation
$T((z, w)) = \lambda(z, w)$, or equivalently, $(-w, z) = (\lambda z, \lambda w)$. Equating the first components and
second components gives us the two equations $-w = \lambda z$ and $z = \lambda w$. Solving the first equation
for $w$ yields $w = -\lambda z$. Substituting into the second equation gives us $z = \lambda(-\lambda z) = -\lambda^2 z$. So,
$z + \lambda^2 z = 0$. Using distributivity on the left-hand side of this equation gives $z(1 + \lambda^2) = 0$. So,
$z = 0$ or $1 + \lambda^2 = 0$. If $z = 0$, then $w = -\lambda \cdot 0 = 0$. So, $(z, w) = (0, 0)$. Since an eigenvector
must be nonzero, we reject $z = 0$. The equation $1 + \lambda^2 = 0$ has the two solutions $\lambda = i$ and
$\lambda = -i$. These are the two eigenvalues of $T$.


Next, let's find the eigenvectors corresponding to the eigenvalue $\lambda = i$. In this case, we have
$T((z, w)) = i(z, w)$, or equivalently, $(-w, z) = (iz, iw)$. So, $-w = iz$ and $z = iw$. These two
equations are actually equivalent. Indeed, if we multiply each side of the second equation by $i$,
we get $iz = i^2 w$, or equivalently, $iz = -w$ or $-w = iz$.


So, we use only one of the equations, say $-w = iz$, or equivalently, $w = -iz$. So, the
eigenvectors of $T$ corresponding to the eigenvalue $\lambda = i$ are all nonzero vectors of the form
$(z, -zi)$. For example, letting $z = 1$, we see that $(1, -i)$ is an eigenvector corresponding to the
eigenvalue $\lambda = i$.


Let's also find the eigenvectors corresponding to the eigenvalue $\lambda = -i$. In this case, we have
$T((z, w)) = -i(z, w)$, or equivalently, $(-w, z) = (-iz, -iw)$. So, $-w = -iz$ and $z = -iw$. Once
again, these two equations are equivalent. Indeed, if we multiply each side of the second
equation by $-i$, we get $-iz = i^2 w$, or equivalently, $-iz = -w$ or $-w = -iz$.


So, we use only one of the equations, say $-w = -iz$, or equivalently, $w = iz$. So, the
eigenvectors of $T$ corresponding to the eigenvalue $\lambda = -i$ are all nonzero vectors of the form
$(z, zi)$. For example, letting $z = 1$, we see that $(1, i)$ is an eigenvector corresponding to the
eigenvalue $\lambda = -i$.


Note that if we consider the vector space $\mathbb{R}^2$ over the field $\mathbb{R}$ instead of $\mathbb{C}^2$ over $\mathbb{C}$, then the
linear transformation $T: \mathbb{R}^2 \to \mathbb{R}^2$ defined by $T((z, w)) = (-w, z)$ has no eigenvalues (and
therefore, no eigenvectors). Algebraically, this follows from the fact that $1 + \lambda^2 = 0$ has no
real solutions.


It is also easy to see geometrically that this transformation has no eigenvalues. The given
transformation rotates any nonzero point $(z, w) \in \mathbb{R}^2$ counterclockwise by 90°. Since no real
multiple of $(z, w)$ results in such a rotation, we see that there is no eigenvalue. The figure
shows how $T$ rotates the point $(1, 1)$ counterclockwise 90° to the point $(-1, 1)$.
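The computations in part 3 of Example 16.9 can be checked numerically. The following is a minimal sketch using Python's built-in complex numbers; the helper names `scale` and `is_eigenpair` are hypothetical and chosen only for this illustration.

```python
def T(vec):
    """The map T((z, w)) = (-w, z) from part 3 of Example 16.9."""
    z, w = vec
    return (-w, z)


def scale(lam, vec):
    """Scalar multiple lam * vec, computed componentwise."""
    return tuple(lam * c for c in vec)


def is_eigenpair(lam, vec):
    """True if vec is nonzero and T(vec) equals lam * vec."""
    return any(c != 0 for c in vec) and T(vec) == scale(lam, vec)


# Over C: i and -i are eigenvalues with eigenvectors (1, -i) and (1, i).
pair_plus = is_eigenpair(1j, (1, -1j))
pair_minus = is_eigenpair(-1j, (1, 1j))

# Over R: T sends (1, 1) to (-1, 1). A real multiple of (1, 1) has two equal
# entries, so (-1, 1) is not a real multiple of (1, 1).
rotated = T((1, 1))
```

Both `pair_plus` and `pair_minus` evaluate to `True`, and `rotated` is the point to which $T$ rotates $(1, 1)$.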


Let $V$ be a vector space over a field $F$, let $v_1, v_2, \ldots, v_n \in V$, and $k_1, k_2, \ldots, k_n \in F$. Recall from Lesson 8
that the expression $k_1 v_1 + k_2 v_2 + \cdots + k_n v_n$ is called a linear combination of the vectors $v_1, v_2, \ldots, v_n$
with weights $k_1, k_2, \ldots, k_n$.


Also recall once more that $v_1, v_2, \ldots, v_n$ are linearly dependent if there exist weights $k_1, k_2, \ldots, k_n \in F$,
with at least one weight nonzero, such that $k_1 v_1 + k_2 v_2 + \cdots + k_n v_n = 0$. Otherwise, we say that
$v_1, v_2, \ldots, v_n$ are linearly independent.


In Problem 6 from Lesson 8, you were asked to prove that if a finite set of at least two vectors is linearly 
dependent, then one of the vectors in the set can be written as a linear combination of the other 
vectors in the set. To prove the next theorem (Theorem 16.11), we will need the following slightly 
stronger result. 


Lemma 16.10: Let $V$ be a vector space over a field $F$ and let $v_1, v_2, \ldots, v_k \in V$ be linearly dependent
with $k \geq 2$. Also assume that $v_1 \neq 0$. Then there is $t \leq k$ such that $v_t$ can be written as a linear
combination of $v_1, v_2, \ldots, v_{t-1}$.


Proof: Suppose that $v_1, v_2, \ldots, v_k$ are linearly dependent and $v_1 \neq 0$. Let $c_1 v_1 + c_2 v_2 + \cdots + c_k v_k = 0$
be a nontrivial dependence relation (in other words, not all the $c_i$ are 0). Since $v_1 \neq 0$, we must have
$c_i \neq 0$ for some $i \neq 1$ (otherwise $c_1 v_1 + c_2 v_2 + \cdots + c_k v_k = 0$ implies $c_1 v_1 = 0$, which implies that
$c_1 = 0$, contradicting that the dependence relation is nontrivial). Let $t$ be the largest value such that
$c_t \neq 0$. Then we have $c_1 v_1 + c_2 v_2 + \cdots + c_k v_k = c_1 v_1 + c_2 v_2 + \cdots + c_t v_t + 0 v_{t+1} + \cdots + 0 v_k$, and so,
$c_1 v_1 + c_2 v_2 + \cdots + c_t v_t = 0$. Since $c_t \neq 0$, we can solve for $v_t$ to get

$$v_t = -\frac{c_1}{c_t} v_1 - \cdots - \frac{c_{t-1}}{c_t} v_{t-1}.$$

So, $v_t$ can be written as a linear combination of $v_1, v_2, \ldots, v_{t-1}$. □


Note: A lemma is a theorem whose primary purpose is to prove a more important theorem. Although
Lemma 16.10 is an important result in Linear Algebra, the main reason we are mentioning it now is to
help us prove the next theorem (Theorem 16.11).
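The step in the proof of Lemma 16.10 that solves for $v_t$ can be sketched in code. The function name `solve_for_last` and the sample vectors in $\mathbb{Q}^2$ are assumptions made only for this illustration; exact rational arithmetic is used so that no rounding occurs.

```python
from fractions import Fraction


def solve_for_last(coeffs):
    """Given coefficients c_1, ..., c_k of a nontrivial dependence relation
    c_1 v_1 + ... + c_k v_k = 0, return (t, weights), where t is the largest
    index with c_t != 0 and v_t = weights[0] v_1 + ... + weights[t-2] v_{t-1}.
    """
    t = max(i for i, c in enumerate(coeffs, start=1) if c != 0)
    c_t = Fraction(coeffs[t - 1])
    # Each weight is -c_i / c_t, exactly as in the displayed formula for v_t.
    weights = [-Fraction(c) / c_t for c in coeffs[:t - 1]]
    return t, weights


# Hypothetical vectors: v_1 = (1, 0), v_2 = (0, 1), v_3 = (2, 3), v_4 = (5, 5).
# They satisfy the dependence relation 2 v_1 + 3 v_2 - 1 v_3 + 0 v_4 = 0.
vectors = [(1, 0), (0, 1), (2, 3), (5, 5)]
t, weights = solve_for_last([2, 3, -1, 0])

# Rebuild v_t from v_1, ..., v_{t-1} using the returned weights.
rebuilt = tuple(
    sum(w * v[j] for w, v in zip(weights, vectors))
    for j in range(2)
)
```

Here the largest index with a nonzero coefficient is $t = 3$, and `rebuilt` agrees with $v_3$.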


Theorem 16.11: Let $V$ be a vector space over a field $F$, let $T \in \mathcal{L}(V)$, and let $\lambda_1, \lambda_2, \ldots, \lambda_k$ be distinct
eigenvalues of $T$ with corresponding eigenvectors $v_1, v_2, \ldots, v_k$. Then $v_1, v_2, \ldots, v_k$ are linearly
independent.


Proof: Suppose toward contradiction that $v_1, v_2, \ldots, v_k$ are linearly dependent. Let $t$ be the least integer
such that $v_t$ can be written as a linear combination of $v_1, v_2, \ldots, v_{t-1}$ (we can find such a $t$ by Lemma
16.10, which applies because the eigenvector $v_1$ is nonzero). Then there are weights $c_1, c_2, \ldots, c_{t-1}$ such that
$v_t = c_1 v_1 + c_2 v_2 + \cdots + c_{t-1} v_{t-1}$. Apply the linear transformation $T$ to each side of this last equation
to get the equation
$T(v_t) = T(c_1 v_1 + c_2 v_2 + \cdots + c_{t-1} v_{t-1}) = c_1 T(v_1) + c_2 T(v_2) + \cdots + c_{t-1} T(v_{t-1})$. Since each $v_i$ is
an eigenvector corresponding to eigenvalue $\lambda_i$, we have $\lambda_t v_t = c_1 \lambda_1 v_1 + c_2 \lambda_2 v_2 + \cdots + c_{t-1} \lambda_{t-1} v_{t-1}$.
We can also multiply each side of the equation $v_t = c_1 v_1 + c_2 v_2 + \cdots + c_{t-1} v_{t-1}$ by $\lambda_t$ to get the
equation $\lambda_t v_t = c_1 \lambda_t v_1 + c_2 \lambda_t v_2 + \cdots + c_{t-1} \lambda_t v_{t-1}$. We now subtract:

$$\begin{aligned}
\lambda_t v_t &= c_1 \lambda_t v_1 + c_2 \lambda_t v_2 + \cdots + c_{t-1} \lambda_t v_{t-1} \\
\lambda_t v_t &= c_1 \lambda_1 v_1 + c_2 \lambda_2 v_2 + \cdots + c_{t-1} \lambda_{t-1} v_{t-1} \\
0 &= c_1(\lambda_t - \lambda_1) v_1 + c_2(\lambda_t - \lambda_2) v_2 + \cdots + c_{t-1}(\lambda_t - \lambda_{t-1}) v_{t-1}
\end{aligned}$$


Since we chose $t$ to be the least integer such that $v_t$ can be written as a linear combination of
$v_1, v_2, \ldots, v_{t-1}$, it follows that $v_1, v_2, \ldots, v_{t-1}$ are linearly independent. Therefore, the constants
$c_1(\lambda_t - \lambda_1), c_2(\lambda_t - \lambda_2), \ldots, c_{t-1}(\lambda_t - \lambda_{t-1})$ are all 0. Since the eigenvalues are all distinct, we must
have $c_1 = c_2 = \cdots = c_{t-1} = 0$. Then $v_t = c_1 v_1 + c_2 v_2 + \cdots + c_{t-1} v_{t-1} = 0$, contradicting our
assumption that $v_t$ is an eigenvector. Therefore, $v_1, v_2, \ldots, v_k$ cannot be linearly dependent. So,
$v_1, v_2, \ldots, v_k$ are linearly independent. □
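As a small sanity check of Theorem 16.11, consider the eigenvectors $(1, -i)$ and $(1, i)$ from part 3 of Example 16.9, which correspond to the distinct eigenvalues $i$ and $-i$. By Lemma 16.10, two vectors with the first one nonzero are linearly dependent only if the second is a scalar multiple of the first. The sketch below tests exactly that; the helper name `is_multiple` is hypothetical, and the function assumes the first entry of `u` is nonzero.

```python
def is_multiple(v, u):
    """True if v = c*u for some scalar c, assuming u[0] != 0."""
    c = v[0] / u[0]  # the only possible scalar, read off from the first entries
    return all(c * a == b for a, b in zip(u, v))


v1 = (1, -1j)  # eigenvector for the eigenvalue i
v2 = (1, 1j)   # eigenvector for the eigenvalue -i

# v2 is not a multiple of v1, so the pair is linearly independent.
independent = not is_multiple(v2, v1)

# By contrast, (2, -2i) is a multiple of v1, so that pair is dependent.
dependent_pair = is_multiple((2, -2j), v1)
```

The first entries force $c = 1$, while the second entries would force $c = -1$, so no single scalar works for $v_1$ and $v_2$.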


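Let $A$ be a square matrix, say $A = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{bmatrix}$. The diagonal entries of $A$ are the entries
$a_{11}, a_{22}, \ldots, a_{nn}$. All other entries of $A$ are nondiagonal entries.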


Example 16.10: The diagonal entries of the matrix $B = \begin{bmatrix} 1 & 5 & 2 \\ 3 & 6 & 0 \\ 2 & 9 & 8 \end{bmatrix}$ are $b_{11} = 1$, $b_{22} = 6$, and $b_{33} = 8$.
The nondiagonal entries of $B$ are $b_{12} = 5$, $b_{13} = 2$, $b_{21} = 3$, $b_{23} = 0$, $b_{31} = 2$, and $b_{32} = 9$.


A diagonal matrix is a square matrix that has every nondiagonal entry equal to 0.


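Example 16.11: The matrix $B$ from Example 16.10 is not a diagonal matrix, while the matrices
$C = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 6 & 0 \\ 0 & 0 & 8 \end{bmatrix}$ and $D = \begin{bmatrix} 5 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 0 \end{bmatrix}$ are diagonal matrices.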
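The definition of a diagonal matrix translates directly into a short test. This is a minimal sketch in which a matrix is represented as a list of rows; the function name `is_diagonal` is an assumption made for illustration.

```python
def is_diagonal(matrix):
    """True if the square matrix (a list of rows) has every nondiagonal entry equal to 0."""
    n = len(matrix)
    return all(
        matrix[i][j] == 0
        for i in range(n)
        for j in range(n)
        if i != j
    )


# The matrices from Examples 16.10 and 16.11.
B = [[1, 5, 2], [3, 6, 0], [2, 9, 8]]
C = [[1, 0, 0], [0, 6, 0], [0, 0, 8]]
D = [[5, 0, 0], [0, -2, 0], [0, 0, 0]]
```

Note that `is_diagonal(D)` holds even though one diagonal entry of $D$ is 0: the definition restricts only the nondiagonal entries.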


Let $V$ be a vector space. A linear transformation $T \in \mathcal{L}(V)$ is said to be diagonalizable if there is a basis
$B$ of $V$ for which $M_T(B)$ is a diagonal matrix.


Example 16.12:


1. Consider $\mathbb{C}$ as a vector space over $\mathbb{R}$ and define $T: \mathbb{C} \to \mathbb{C}$ by $T(z) = 5z$, as we did in part 1 of
Example 16.1. The equation $T(z) = 5z$ tells us that every nonzero vector $z$ is an eigenvector
corresponding to the eigenvalue $\lambda = 5$. $T(z) = \lambda z$ is equivalent to $5z = \lambda z$ or $(\lambda - 5)z = 0$.
So, $\lambda = 5$ is the only eigenvalue. In particular, the standard basis vectors 1 and $i$ are
eigenvectors corresponding to the eigenvalue $\lambda = 5$. We have

$T(1) = 5 = 5 \cdot 1 + 0 \cdot i$.
$T(i) = 5i = 0 \cdot 1 + 5 \cdot i$.

So, as we saw in part 1 of Example 16.6, the matrix of $T$ with respect to the standard basis is
$M_T = \begin{bmatrix} 5 & 0 \\ 0 & 5 \end{bmatrix}$, a diagonal matrix. Therefore, $T$ is diagonalizable.


2. Consider $\mathbb{R}^3$ as a vector space over $\mathbb{R}$ and define $T: \mathbb{R}^3 \to \mathbb{R}^3$ by
$T((x, y, z)) = (3x + y, y - 2z, 7z)$.
Let's find the eigenvalues and eigenvectors of $T$.


We start by solving the equation $T((x, y, z)) = \lambda(x, y, z)$. This equation is equivalent to the
three equations $3x + y = \lambda x$, $y - 2z = \lambda y$, and $7z = \lambda z$. We work backwards. If $z \neq 0$, we get
$\lambda = 7$. If $z = 0$ and $y \neq 0$, we get $\lambda = 1$. Finally, if $z = 0$, $y = 0$, and $x \neq 0$, we get $\lambda = 3$.


So, the eigenvalues of $T$ are 7, 1, and 3.

If we let $\lambda = 7$, we get $3x + y = 7x$, $y - 2z = 7y$, and $7z = 7z$. The equation $y - 2z = 7y$ is
equivalent to the equation $6y = -2z$, or $y = -\frac{1}{3}z$. The equation $3x + y = 7x$ is equivalent to
$4x = y = -\frac{1}{3}z$, or $x = -\frac{1}{12}z$. So, if we let $z = -12$, we get the eigenvector $v_1 = (1, 4, -12)$.


If we let $\lambda = 1$, we get $3x + y = x$, $y - 2z = y$, and $7z = z$. The equation $7z = z$ is equivalent
to the equation $z = 0$. The equation $y - 2z = y$ is then equivalent to $y = y$. The equation
$3x + y = x$ is equivalent to $2x = -y$ or $x = -\frac{1}{2}y$. So, if we let $y = -2$, we get the eigenvector
$v_2 = (1, -2, 0)$.


If we let $\lambda = 3$, we get $3x + y = 3x$, $y - 2z = 3y$, and $7z = 3z$. The equation $7z = 3z$ is
equivalent to the equation $z = 0$. The equation $y - 2z = 3y$ is then equivalent to $y = 0$. The
equation $3x + y = 3x$ is then equivalent to $x = x$. So, if we let $x = 1$, we get the eigenvector
$v_3 = (1, 0, 0)$.


By Theorem 16.11, these three eigenvectors are linearly independent. Since $\mathbb{R}^3$ has dimension 3, it
follows that $B = \{(1, 4, -12), (1, -2, 0), (1, 0, 0)\}$ is a basis of $\mathbb{R}^3$ consisting of eigenvectors of $T$, and we have

$T((1, 4, -12)) = 7(1, 4, -12)$
$T((1, -2, 0)) = 1(1, -2, 0)$
$T((1, 0, 0)) = 3(1, 0, 0)$


Therefore, the matrix of $T$ with respect to $B$ is $M_T(B) = \begin{bmatrix} 7 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 3 \end{bmatrix}$.


Since $M_T(B)$ is a diagonal matrix, $T$ is diagonalizable.


3. Recall from part 3 of Example 16.9 the linear transformation $T: \mathbb{R}^2 \to \mathbb{R}^2$ defined by
$T((x, y)) = (-y, x)$ (where $\mathbb{R}^2$ is being viewed as a vector space over the field $\mathbb{R}$). We saw in
that example that this linear transformation has no eigenvalues. It follows that there is no basis
for $\mathbb{R}^2$ such that the matrix of $T$ with respect to that basis is a diagonal matrix (the vectors in
such a basis would be eigenvectors of $T$). In other words, $T$ is not diagonalizable.


However, in the same example, we saw that the linear transformation $T: \mathbb{C}^2 \to \mathbb{C}^2$ defined by
$T((z, w)) = (-w, z)$ (where $\mathbb{C}^2$ is being viewed as a vector space over the field
$\mathbb{C}$) has eigenvalues $i$ and $-i$ with eigenvectors corresponding to these eigenvalues of $(1, -i)$
and $(1, i)$, respectively. So, we have

$T((1, -i)) = i(1, -i)$
$T((1, i)) = -i(1, i)$


So, the matrix of $T$ with respect to the basis $B = \{(1, -i), (1, i)\}$ is $M_T(B) = \begin{bmatrix} i & 0 \\ 0 & -i \end{bmatrix}$, a
diagonal matrix. Therefore, $T$ is diagonalizable.
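The matrix found in part 2 of Example 16.12 can be recomputed mechanically. The sketch below assumes the usual convention that column $j$ of $M_T(B)$ lists the coordinates of $T(b_j)$ relative to $B$ (for a diagonal matrix the result is the same under the row convention). The helper names `coordinates` and `matrix_of` are hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def T(v):
    """T((x, y, z)) = (3x + y, y - 2z, 7z) from part 2 of Example 16.12."""
    x, y, z = v
    return (3 * x + y, y - 2 * z, 7 * z)


def coordinates(basis, target):
    """Solve c_1 b_1 + ... + c_n b_n = target by Gauss-Jordan elimination.

    Assumes `basis` really is a basis, so a pivot can always be found.
    """
    n = len(basis)
    # Augmented matrix: column j holds b_j, the last column holds target.
    rows = [[Fraction(basis[j][i]) for j in range(n)] + [Fraction(target[i])]
            for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [entry / rows[col][col] for entry in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]


def matrix_of(T, basis):
    """Matrix whose j-th column lists the coordinates of T(b_j) relative to basis."""
    cols = [coordinates(basis, T(b)) for b in basis]
    n = len(basis)
    return [[cols[j][i] for j in range(n)] for i in range(n)]


B = [(1, 4, -12), (1, -2, 0), (1, 0, 0)]
M = matrix_of(T, B)
```

With the basis of eigenvectors above, `M` has the eigenvalues 7, 1, and 3 on the diagonal and 0 everywhere else.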


We finish with a theorem that gives a sufficient condition for a linear transformation to be
diagonalizable.


Theorem 16.12: Let $V$ be an $n$-dimensional vector space and let $T \in \mathcal{L}(V)$ have $n$ distinct eigenvalues.
Then $T$ is diagonalizable.


Proof: Suppose that $\dim V = n$ and $T \in \mathcal{L}(V)$ has the $n$ distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$, with
corresponding eigenvectors $v_1, v_2, \ldots, v_n$. By Theorem 16.11, $v_1, v_2, \ldots, v_n$ are linearly independent. By
the note following Theorem 16.8, $v_1, v_2, \ldots, v_n$ can be extended to a basis of $V$. However, a basis of $V$
has $n$ elements and therefore, $B = \{v_1, v_2, \ldots, v_n\}$ is already a basis of $V$. Since $T(v_1) = \lambda_1 v_1$,
$T(v_2) = \lambda_2 v_2, \ldots, T(v_n) = \lambda_n v_n$, it follows that

$$M_T(B) = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}.$$

Since $M_T(B)$ is a diagonal matrix, $T$ is diagonalizable. □
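The proof of Theorem 16.12 is constructive, and its steps can be sketched as a small routine: given $n$ eigenpairs with distinct eigenvalues for an operator on an $n$-dimensional space, verify each pair and return the resulting diagonal matrix. The function name `diagonal_matrix_of` and its error messages are assumptions made for this illustration, applied here to the operator on $\mathbb{C}^2$ from part 3 of Example 16.12.

```python
def T(vec):
    """T((z, w)) = (-w, z), viewed as an operator on C^2."""
    z, w = vec
    return (-w, z)


def diagonal_matrix_of(T, eigenpairs, dim):
    """Return diag(lambda_1, ..., lambda_n) for dim eigenpairs (lambda, v) of T.

    Raises ValueError unless there are exactly dim pairs with distinct
    eigenvalues and each v is a nonzero vector with T(v) = lambda * v.
    """
    eigenvalues = [lam for lam, _ in eigenpairs]
    if len(eigenpairs) != dim or len(set(eigenvalues)) != dim:
        raise ValueError("need exactly dim distinct eigenvalues")
    for lam, v in eigenpairs:
        if all(c == 0 for c in v) or T(v) != tuple(lam * c for c in v):
            raise ValueError("not an eigenpair")
    # By Theorem 16.11 the eigenvectors form a basis, and T(v_j) = lambda_j v_j
    # says the matrix with respect to that basis is diagonal.
    return [[eigenvalues[i] if i == j else 0 for j in range(dim)]
            for i in range(dim)]


M = diagonal_matrix_of(T, [(1j, (1, -1j)), (-1j, (1, 1j))], 2)
```

The condition of the theorem is sufficient but not necessary: the transformation in part 1 of Example 16.12 has only one eigenvalue and is still diagonalizable, so a routine like this one would reject it even though a diagonal matrix exists.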

Problem Set 16


Full solutions to these problems are available for free download here:


www.SATPrepGet800.com/PMFBXSG


LEVEL 1


1. Let $V$ and $W$ be vector spaces over $\mathbb{R}$. Determine if each of the following functions is a linear
transformation:


(i) $f: \mathbb{R} \to \mathbb{R}$ defined by $f(x) = 2x + 1$
(ii) $g: \mathbb{R} \to \mathbb{R}^2$ defined by $g(x) = (2x, 3x)$
(iii) $h: \mathbb{R}^3 \to \mathbb{R}^3$ defined by $h((x, y, z)) = (x + y, x + z, z - y)$


2. Compute each of the following: 


1 13 0 
20 - 
(i) h -4 2 o 
Boii -4 29 
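(ii) $\begin{bmatrix} 3 & -1 & 5 \end{bmatrix} \cdot \begin{bmatrix} -4 \\ -7 \\ 2 \end{bmatrix}$
(iii) $\begin{bmatrix} -4 \\ -7 \\ 2 \end{bmatrix} \cdot \begin{bmatrix} 3 & -1 & 5 \end{bmatrix}$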
a b c] f1 0 1 
(iv) jd e ff 2 | 
g h il 13 1 4 


LEVEL 2 


3. Consider $\mathbb{C}$ as a vector space over itself. Give an example of a function $f: \mathbb{C} \to \mathbb{C}$ such that $f$ is
additive, but not a linear transformation. Then give an example of vector spaces $V$ and $W$ and a
homogenous function $g: V \to W$ that is not a linear transformation.


LEVEL 3 


4. Let $P = \{ax^2 + bx + c \mid a, b, c \in \mathbb{R}\}$ be the vector space of polynomials of degree at most 2 with real
coefficients (see part 3 of Example 8.3 from Lesson 8). Define the linear transformation
$D: P \to P$ by $D(ax^2 + bx + c) = 2ax + b$. Find the matrix of $D$ with respect to each of the
following bases:


(i) The standard basis $B = \{1, x, x^2\}$
(ii) $C = \{x + 1, x^2 + 1, x^2 + x\}$


5. Let $V$ and $W$ be vector spaces with $V$ finite-dimensional, let $U \leq V$, and let $T \in \mathcal{L}(U, W)$. Prove
that there is an $S \in \mathcal{L}(V, W)$ such that $S(v) = T(v)$ for all $v \in U$.


LEVEL 4 


6. Let $T: V \to W$ be a linear transformation and let $v_1, v_2, \ldots, v_n \in V$. Prove the following:


(i) If $T$ is injective and $v_1, v_2, \ldots, v_n$ are linearly independent in $V$, then
$T(v_1), T(v_2), \ldots, T(v_n)$ are linearly independent in $W$.


(ii) If $T$ is surjective and $\text{span}\{v_1, v_2, \ldots, v_n\} = V$, then $\text{span}\{T(v_1), T(v_2), \ldots, T(v_n)\} = W$.


7. Determine if each linear transformation is diagonalizable:

(i) $T: \mathbb{R}^2 \to \mathbb{R}^2$ defined by $T((x, y)) = (y, 2x)$

(ii) $U: \mathbb{C}^2 \to \mathbb{C}^2$ defined by $U((z, w)) = (z + iw, iz - w)$.


8. Let V and W be vector spaces over a field F. Prove that L(V, W) is a vector space over F, where 
addition and scalar multiplication are defined as in Theorem 16.2. 


9. Let V be a vector space over a field F. Prove that L(V) is a linear algebra over F, where addition
and scalar multiplication are defined as in Theorem 16.2 and vector multiplication is given by 
composition of linear transformations. 


10. Let $T: V \to W$ and $S: W \to V$ be linear transformations such that $ST = i_V$ and $TS = i_W$. Prove
that $S$ and $T$ are bijections and that $S = T^{-1}$.


11. Let $V$ and $W$ be finite-dimensional vector spaces and let $T \in \mathcal{L}(V, W)$. Prove the following:
(i) If $\dim V < \dim W$, then $T$ is not surjective.
(ii) If $\dim V > \dim W$, then $T$ is not injective.


12. Prove that two finite-dimensional vector spaces over a field F are isomorphic if and only if they 
have the same dimension. 


13. Let $T \in \mathcal{L}(V)$ be invertible and let $\lambda \in F \setminus \{0\}$. Prove that $\lambda$ is an eigenvalue of $T$ if and only if
$\frac{1}{\lambda}$ is an eigenvalue of $T^{-1}$.


LEVEL 5 


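14. Let $V$ be a vector space with $\dim V > 1$. Show that $\{T \in \mathcal{L}(V) \mid T \text{ is not invertible}\} \nleq \mathcal{L}(V)$ (in other words, this set is not a subspace of $\mathcal{L}(V)$).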


15. Let $V$ be an $n$-dimensional vector space over a field $F$. Prove that there is a linear algebra
isomorphism $F: \mathcal{L}(V) \to M_{nn}^{F}$.


INDEX


Abelian, 33
Abelian group, 35
Absolute value, 82
Absorption law, 109
Accessible space, 197
Accumulation point, 91
Additive function, 234
Algebraically closed field, 78
Almost ring, 146
Angle, 212
Angle in standard position, 212
Antireflexive, 120
Antisymmetric, 120
Archimedean Property, 60
Argument of a complex number, 216
Associative, 32
Associative law, 109
Assumption, 109
Atomic statement, 9, 107
Automorphism, 146
Axiom of Extensionality, 26
Ball, 201
Base case, 44
Basis, 102, 104
Basis for a topology, 192
Biconditional, 11
Biconditional elimination, 114
Biconditional introduction, 114
Biconditional law, 109
Bijection, 126
Bijective function, 126
Binary connective, 10
Binary operation, 30
Binary relation, 119, 137
Boundary point, 92
Bounded, 58
Bounded above, 58
Bounded below, 58
Bounded interval, 64
Canonical form, 162
Canonical representation, 162
Cantor-Schroeder-Bernstein 
Theorem, 133 

Cantor’s Theorem, 131 
Cardinality, 20 

Cartesian product, 30, 119 
Chains of topologies, 191 
Circle, 85, 212 
Circumference, 212 

Clopen, 200 

Closed disk, 86 

Closed downwards, 33, 98 
Closed interval, 64 

Closed set, 74, 89, 189 
Closing statement, 21 
Closure, 31, 35 

Coarser topology, 190 
Codomain, 125 

Cofinite topology, 198 
Common divisor, 159 
Common factor, 159 
Common multiple, 159 
Commutative, 33 
Commutative group, 35 
Commutative law, 109 
Compact space, 203 
Comparability condition, 124 
Complement, 74, 89 


Complete prime factorization, 
164 


Completeness, 58 
Completeness Property, 60 
Complex number, 78 
Composite number, 152 
Composite function, 129 
Compound statement, 9 
Conclusion, 109, 111 
Conditional, 11 
Conditional law, 109 
Conjugate, 81, 147 
Conjunction, 11
Conjunctive elimination, 114
Conjunctive introduction, 114 
Connective, 9 

Constant, 137 

Constant function, 125 
Constructive dilemma, 114 
Continuity, 174, 177, 204, 226 


Continuous at a point, 174, 
177, 205, 226 


Continuous function, 174, 204 
Contradiction, 109 
Contrapositive, 113 
Contrapositive law, 109 
Converse, 113 

Corollary, 130 

Cosine, 214 

Countable, 131 
Counterexample, 31 
Cover of a topology, 194 
Covering, 203 

Cycle diagram, 148 

Cycle notation, 148 

De Moivre’s Theorem, 217 
De Morgan’s laws, 11, 77, 109 
Deleted neighborhood, 87 
Dense, 61 

Density Theorem, 61 
Denumerable, 131 
Dependence relation, 104 
Derivation, 114 
Destructive dilemma, 114 
Diagonal entry, 250 
Diagonal matrix, 250 
Diagonalizable, 250 
Difference identity, 232 
Dilation, 234 

Dimension, 244 

Discrete topology, 190 
Disjoint, 25, 70 
Disjunction, 11 

Disjunctive introduction, 114 


Disjunctive resolution, 114 
Disjunctive syllogism, 114 
Disk, 86 

Distance, 83 

Distance function, 201 
Distributive, 39, 77 
Distributive law, 107 
Distributivity, 40 

Divides, 42, 152 

Divisible, 42, 152 
Divisibility, 41 

Division Algorithm, 155, 156 
Divisor, 42, 152 

Domain, 125, 137 

Double negation, 108 
Eigenvalue, 247 
Eigenvector, 247 
Element, 19 

Empty set, 20 
Equinumerosity, 130 
Equivalence class, 122 
Equivalence relation, 121 
Euclidean Algorithm, 165 
Euler’s formula, 216 
Even, 41, 47 

Exclusive or, 17 


Exponential form of a complex 
number, 216 


Extended Complex Plane, 229 
Factor, 42, 152 

Factor tree, 162 

Factorial, 154 
Factorization, 153 

Fallacy, 112 

Fallacy of the converse, 113 
Fallacy of the inverse, 113 
Fence-post formula, 20 
Field, 41, 50 

Field axioms, 50, 51 

Field homomorphism, 143 
Finer topology, 190 

Finitary operation, 137 
Finitary relation, 137 


Finite-dimensional vector 
space, 244 

Finite sequence, 126 
Fixed point, 221 
Function, 124, 218 
Fundamental Theorem of 
Arithmetic, 152 


Gaussian integer, 150 

GCD, 159 

Greatest common divisor, 159 
Greatest common factor, 159 
Greatest lower bound, 58 
Group, 34 

Group homomorphism, 142 
Half-open interval, 64 
Hausdorff space, 198 
Homeomorphic spaces, 207 
Homeomorphism, 207 
Homogenous function, 234 
Homomorphism, 142 
Horizontal strip, 169 
Hypothesis, 109 
Hypothetical syllogism, 114 
Ideal, 149 

Identity, 34 

Identity function, 130 
Identity law, 109 

Image, 146, 204, 244 
Imaginary part, 79 
Implication, 11 

Inclusion map, 134 
Incomparable topologies, 190 
Indiscrete topology, 190 
Induced topology, 202 
Induction, 43 

Inductive hypothesis, 44 
Inductive step, 44 

Infimum, 58, 59 

Infinite closed interval, 64 
Infinite interval, 64 

Infinite limit, 183 

Infinite open interval, 64 
Infinite sequence, 126
Infinite set, 19

Initial ray, 212 

Injection, 126 

Injective function, 126 
Integer, 19 

Interior point, 92 
Intersection, 24, 66, 69 
Intersection containment 
property, 194 

Interval, 64 

Invalid, 112 

Invariant, 220 

Invariant subspace, 247 
Inverse, 34, 113 

Inverse function, 127 
Invertible, 34 
Isomorphism, 55, 145 
Kernel, 146, 244 
Kolmogorov space, 197 
LCM, 159 

Least common multiple, 159 
Least upper bound, 58 
Left distributivity, 39 
Lemma, 250 

Limit, 172, 176, 177, 223 
Limits involving infinity, 183 
Linear algebra, 237 


Linear combination, 101, 103, 
160 


Linear dependence, 102, 104 
Linear equation, 78 

Linear function, 179 

Linear independence, 102, 104 
Linear transformation, 234 
Linearly ordered set, 124 
Logical argument, 111 
Logical connective, 9 

Logical equivalence, 108 
Lower bound, 58 

Matrix, 97, 239 

Matrix addition, 97 


Matrix of a linear 
transformation, 242 


Matrix multiplication, 240, 241 
Matrix scalar multiplication, 97 
Metric, 201 

Metric space, 201 
Metrizable space, 202 
Modulus, 82 

Modus ponens, 112, 114 
Modus tollens, 114 
Monoid, 34 

Monoid homomorphism, 142 
Monotonic function, 143 
Multiple, 42, 152 

Mutually exclusive, 25 
Mutually relatively prime, 160 
Natural number, 19 
Negation, 11 

Negation law, 109 
Negative identities, 215 
Neighborhood, 86 
Nondiagonal entry, 250 
Normal, 147 

Normal space, 200 

Normal subgroup, 147 
North pole, 228 

Null space, 244 

Nullity, 246 

Odd, 47 

One-sided limit, 185 
One-to-one function, 126 
Onto, 126 

Open ball, 201 

Open covering, 203 

Open disk, 86 

Open interval, 64 

Open rectangle, 170 

Open set, 71, 87, 189 
Opening statement, 21 
Order homomorphism, 143 
Ordered field, 52 

Ordered pair, 118 

Ordered ring, 52 

Ordered tuple, 118 
Ordering, 124 


Pairwise disjoint, 70, 121 
Pairwise relatively prime, 160 
Parity, 121 

Partial binary operation, 31 
Partial ordering, 124 

Partially ordered set, 124 
Partition, 121 

Permutation, 148 

Point at infinity, 228 


Polar form of a complex 
number, 216 


Polynomial equation, 78 
Polynomial ring, 151 
Poset, 124 

Positive square root, 82 
Power set, 23 

Premise, 109, 111 
Prime factorization, 153 
Prime number, 152 


Principle of Mathematical 
Induction, 43 


Principal root, 218
Product, 41 

Product topology, 210 
Proof, 111 

Proof by contradiction, 44 
Proof by contrapositive, 129 
Proposition, 9, 107 
Propositional variable, 10, 107 
Punctured disk, 87 

Pure imaginary number, 79 
Pythagorean identity, 215 
Pythagorean Theorem, 56 
Quadrantal angles, 214 
Quadratic equation, 78 
Quotient, 35, 81, 155 
Radian measure, 212 
Range, 125, 244 

Rank, 246 

Rational number, 35 

Ray, 212 

Real number, 60, 79 

Real part, 79
Redundancy law, 109
Reflection, 221 
Reflexive, 29, 120 
Regular space, 200 
Relation, 119, 120, 137 
Relatively prime, 159 


Representative of equivalence 
class, 123 


Riemann sphere, 228 

Right distributivity, 39 

Ring, 39 

Ring axioms, 40 

Ring homomorphism, 143 
Ring ideal, 149 

Ring with identity, 40 

Rng, 147 

Root of a complex number, 218 
Roots of unity, 218 
Rotation, 222 

Rule of inference, 112 
SACT, 45 

Scalar multiplication, 93, 95 
Semigroup, 32 


Semigroup homomorphism, 
142 


Semiring, 41 
Separation axioms, 197 
Sequence, 126 

Set, 19 

Set-builder notation, 20 
Set complement, 74 
Set difference, 66 
Sigma notation, 241 
Simple subspace, 247 
Sine, 214 

Soundness, 113 

South pole, 228 

Span, 101, 103 

Square matrix, 240 
Square root, 82, 218 


Standard Advanced Calculus 
Trick, 45 


Standard form of a complex 
number, 78, 216 


Standard topology, 192 
Statement, 9, 107 

Strict linearly ordered set, 124 
Strict partial ordering, 124 
Strict partially ordered set, 124 
Strict poset, 124 

Strip, 169 

Strong Induction, 49 

Subbasis for a topology, 196 
Subfield, 80, 141 

Subgroup, 140 

Submonoid, 139 

Subring, 140 

Subsemigroup, 139 

Subset, 20 

Subspace, 98 

Subspace topology, 210 
Substatement, 107 


Substitution of logical 
equivalents, 109 


Substitution of sentences, 109 
Substructure, 139 

Sum, 41 

Sum identities, 215 
Summation, 241 


Supremum, 59 

Surjection, 126 

Surjective function, 126 
Surjectively invariant, 220 
Symmetric, 29, 120 
Symmetric difference, 66 
Tangent, 214 
Tautologically implies, 112 
Tautology, 22, 109 
Terminal ray, 212 

Ternary relation, 120, 137 
Theorem, 21 

Tichonov space, 197 
Topological equivalence, 207 
Topological invariant, 209 
Topological property, 209 
Topological space, 189 
Topology, 85, 189 

Totally ordered set, 124 
Transitive, 24, 120 
Transitivity of logical 
equivalence, 109 
Translation, 220 

Tree diagram, 23 

Triangle Inequality, 84 
Trichotomy, 124 
Trigonometric functions, 214
Trivial topology, 190

Truth table, 12 

Type, 138 

Unary connective, 10 

Unary relation, 120, 137 
Uncountable, 131 

Uniform continuity, 180, 227 
Uniformly continuous, 180, 227 
Union, 24, 66, 69 

Unit circle, 212 

Unital ring, 40 

Universal Quantifier, 21 
Universal set, 21 

Universal statement, 33, 98 
Unordered pair, 118 

Upper bound, 58 

Valid, 112 

Vector, 79, 95 

Vector multiplication, 237 
Vector space, 93 

Venn diagram, 21 

Vertical strip, 169 

Weight, 101, 103, 160 
Well-defined, 123 

Well Ordering Principle, 43 
Without loss of generality, 73 
Wrapping function, 214


Pure Mathematics for Beginners consists of a series of lessons in logic,
set theory, abstract algebra, number theory, real analysis, topology,
complex analysis, and linear algebra.


This book is perfect for any college level course intended to introduce
students to proving theorems in higher level mathematics. Because of the
wide range of content covered, instructors can easily create a wide
range of courses by simply choosing from among the 16 lessons in the
book.


High school and college students who want to begin learning advanced
mathematics on their own will also find this book to be quite useful. The
book is completely self-contained with no prerequisites. Furthermore,
proofs are presented without “skipping over any steps” and many
examples and additional analyses of theorems are presented to help
clarify material that many students ordinarily find difficult.


